OpenJourney: Midjourney, but Open Source (open-journey.github.io)
celestialcheese 11 days ago [-]
If anyone wants to try it out without having to build and install the thing - https://replicate.com/prompthero/openjourney

I've been using openjourney (and MJ/SD) quite a bit, and it does generate "better" with "less" compared to standard v1.5, but it's nowhere close to Midjourney v4.

Midjourney is so far ahead in generating "good" images across a wide space of styles and subjects using very little prompting, while SD requires careful negative prompts and extravagant prompting to generate something decent.

Very interested in being wrong about this, there's so much happening with SD that it's hard to keep up with what's working best.

MintsJohn 11 days ago [-]
I've been thinking that for months, but recently I've swung towards being more optimistic about SD again: everything Midjourney makes looks Midjourney, while SD lets you create images in any style. MJ really needs to get rid of that MJ style, or make it optional; it's undeniably pretty, it's just becoming a little much.

But I still feel 2.x is somehow a degradation of 1.x; it's hard to get something decent out of it. The custom training/tuning and all is nice (and certainly the top reason to use SD over MJ, since there are many use cases MJ just can't handle), but it shouldn't be used as a band-aid for apparently inherent shortcomings in the new CLIP layer (I'm assuming that's where the largest difference comes from, since the U-Net is trained on largely the same dataset as 1.x).

throwaway675309 11 days ago [-]
To be fair, that's just MJ's default style; you're seeing it a lot because most users don't take the time to add style modifiers to their prompts.

If you add qualifiers such as soft colors, impressionistic, western animation, stencil, etc., you can steer Midjourney towards much more personalized styles.

brianjking 11 days ago [-]
Yeah, a lot of Midjourney images are very clearly Midjourney images. Does Midjourney have inpainting/outpainting yet? I admit it's the offering I've evaluated the least.

Midjourney's images, upscaled to their current maximum offering, look fantastic, that's for sure. My wife generates some really great stuff just for fun.

lobocinza 10 days ago [-]
It has inpainting and scaffolding at least.
smeagull 11 days ago [-]
SD really shat the bed, and a bunch of projects appear to have stuck with 1.5.
michaelbrave 10 days ago [-]
I think 2.0 still has potential: it works much better with textual-inversion-type models, which can kinda play nice with each other, so given enough of those I imagine you can get some cool stuff out of it. I've also heard it handles negative prompts much better, so those are less optional in 2.0.

But yeah, for now all my custom models are 1.5, so I've yet to fully upgrade; most of the community seems to be doing the same at the moment.

lobocinza 10 days ago [-]
MJ is easy to get started with and works well out of the box. SD is for those who want to do things MJ can't, like embeddings.
ted_bunny 11 days ago [-]
What's SD? No one's said.
agf 11 days ago [-]
tehsauce 11 days ago [-]
stable diffusion
chamwislothe2nd 11 days ago [-]
Every midjourney image has the same feeling to it. A bit 1950s sci-fi artist. I guess it's just that it all looks airbrushed? I can't put my finger on it.
cwkoss 11 days ago [-]
Yeah, I think Midjourney makes fewer unsuccessful images, but it's harder to get images that don't match their particular style.
TillE 11 days ago [-]
I don't know if that was Midjourney's intent, but it seems like a smart approach. Instead of trying to be everything to everyone and generating quite a lot of ugly garbage, you get consistently good-looking stuff in a certain style. I'm sure it helps their business model.
another-dave 10 days ago [-]
Feels like it's the Instagram model for prompt-generated images.

Anyone can get a camera phone, take a picture and use some free software (e.g. gimp) to get great results in post-processing.

Most non-expert users though want to click on a few pre-defined filters, find one they like & run with it, rather than having more control yet poorer results (precisely because they _aren't_ experts).

IshKebab 11 days ago [-]
It's the science magazine article illustration look.
brianl047 11 days ago [-]
Sounds great

If Midjourney applies this to all their artwork then maybe it alleviates some of the ethical concerns (Midjourney then has a "style" independent of the training data)

VulgarExigency 10 days ago [-]
But the style isn't independent of training data. If you don't feed Midjourney images in that style, it's not going to come up with it independently.
lobocinza 10 days ago [-]
I've played a lot with it lately and that's just not true. If you play with styles, colors, angles, and views, you have a lot of control over how the image will look. It can emulate pretty much every mainstream aesthetic.
ImprobableTruth 11 days ago [-]
I think it's down to having a lot of feedback data from being a service; SD has its aesthetics ratings, but I assume they pale in comparison.
nickthegreek 11 days ago [-]
This is just an SD checkpoint trained on Midjourney output. You can load it into a1111 or InvokeAI for easier usage. If you're looking for new checkpoints, though, check out the Protogen series for some really neat stuff.
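If you'd rather skip the UIs, here's a minimal sketch (not from this thread) of loading it with the diffusers library; the repo id is the prompthero one linked above, the "mdjrny-v4 style" trigger phrase comes from that model card, and an NVIDIA GPU is assumed:

    # Hedged sketch: load the OpenJourney checkpoint with diffusers instead of a UI.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "prompthero/openjourney", torch_dtype=torch.float16
    ).to("cuda")  # assumes an NVIDIA GPU

    # The model card suggests prefixing prompts with "mdjrny-v4 style"
    image = pipe("mdjrny-v4 style, a lighthouse on a cliff at sunset").images[0]
    image.save("lighthouse.png")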
rahimnathwani 11 days ago [-]
Do you mean this one? https://huggingface.co/darkstorm2150/Protogen_Infinity_Offic...

On the same topic, is there some sort of 'awesome list' of finetuned SD models? (something better than just browsing https://huggingface.co/models?other=stable-diffusion)

liuliu 11 days ago [-]
nickthegreek 11 days ago [-]
Not sure why this is downvoted. Civitai does in fact list a bunch of fine-tuned models and can be sorted by highest rated, most liked, most downloaded, etc. It is a good resource. Many of the models are also available in the .safetensors format, so you don't have to worry about a pickled checkpoint.
lancesells 11 days ago [-]
I didn't downvote, but I have to say the images shown on that page are hilariously juvenile. I was a teenager once, so I get it, but I'm guessing the content is where the downvotes are coming from?
CyanBird 11 days ago [-]
"The internet is for Porn! The internet is for Porn! So grab your dick and double click! For Porn! Porn! Porn!"

Apologies for the bad taste, but I simply love that song, an absolute classic

https://youtu.be/j6eFNRKEROw

Anyhow, regarding Civitai, you can filter out the NSFW models quite easily.

It ought to be noted that even though Protogen 5.3 is not an explicit porn model, it was trained with explicit models... so it can be... raunchy as well.

narrator 11 days ago [-]
Looking at this site, I would argue that the canonical "hello world" of an image diffusion model is a picture of a pretty woman. The canonical "hello world" for community chatbots that can run on a consumer GPU will undoubtedly be an AI girlfriend.
alephaleph 10 days ago [-]
Lena all over again
rahimnathwani 11 days ago [-]
Thanks.

BTW I love your app! At my desk I use Automatic1111 (because I have a decent GPU), but it's so nice to have a lean back experience on my iPad. Also, even my 6yo son can use it, as he doesn't need to manipulate a mouse.

dr_dshiv 11 days ago [-]
Wow. Is there something like this for text models?
madeofpalk 10 days ago [-]
why are they all big breasted women?
nickthegreek 11 days ago [-]
Here are the protogen models https://civitai.com/user/darkstorm2150
pdntspa 11 days ago [-]
I just gave Protogen a spin and the diversity of outputs it gave me was abysmal. Every seed for the same (relatively open-ended) prompt used the same color scheme, had the same framing, and the same composition. Whereas with SD 1.5/2.1, the subject would be placed differently in-frame, color schemes were far more varied, and results were far more interesting compositionally. (This is with identical settings between the two models and a random seed)

So unless you want cliche-as-fuck fantasy and samey waifu material, classic SD seems to do a much better job.

vintermann 10 days ago [-]
Yes, Protogen is based on merging checkpoints, and the checkpoints it's merged from are also mostly based on merging. Tracing the ancestry back to the original fine-tuned models is hard, but there's a ton of booru-tagged anime and porn in there.

If there's one style I dislike more than the bland Midjourney style, it's the super-smooth "realistic" child faces on adult bodies that protogen (and its own many descendants) spit out.

quitit 11 days ago [-]
It's actually worse, because automatic and invoke will let you chain up GANs to fix faces and the like, and both have trivial installation procedures.

This offering is like going back to August 2022.

152334H 11 days ago [-]
HN is just incredibly bad at figuring out what kind of ML projects are worth getting excited about and what aren't.

MJ v4 doesn't even use Stable Diffusion as a base [0]; a fine-tune of SD will never come close to achieving what they do.

[0] - https://discord.com/channels/729741769192767510/730095596861...

kossTKR 11 days ago [-]
It doesn't use Stable Diffusion?

I thought everything besides DALL-E was SD under the hood.

tsurba 11 days ago [-]
Earlier MJ versions were around before SD came out; before DALL-E 2 too, but after DALL-E 1, IIRC. So I assume they have their own custom setup, perhaps based on the DALL-E 1 paper originally (not the weights, as those were never published) and improved from there.
kossTKR 10 days ago [-]
Interesting, I thought Stable Diffusion was the only other "big player" besides OpenAI, because of the expense of training and of extrapolating from papers / new research.

Is Midjourney heavily funded? Because if they can compete with SD, why aren't we seeing lots of people doing the same, even in the open source space?

Eduard 11 days ago [-]
I didn't understand a single word you said :D
lxe 11 days ago [-]
sd checkpoint -- stable diffusion checkpoint. a model weights file that was obtained by tuning the stablediffusion weights file using probably something like dreambooth on some number of midjourney-generated images.

a1111 / invokeai -- stable diffusion UI tools

Protogen series -- popular stablediffusion checkpoints you can download so you can generate content in various styles

throwaway64643 10 days ago [-]
> This is just a sd checkpoint trained on output of Midjourney

Which is sub-optimal, verging on bad. You don't want to train on output from an AI, because you'll end up with a worse version of whatever that AI is already bad at (hands, feet, and countless other things). This is the AI feedback loop that people have been talking about.

So instead of figuring out what Midjourney has done to get such good results, people just blatantly copied those results and fed them straight into the AI, true to the art-thief stereotype.

version_five 11 days ago [-]
The Hugging Face element of these annoys me. Reading the other comments, this is just a Stable Diffusion checkpoint, so I should be able to download it without using the diffusers library or whatever other HF stuff. It's frustrating that it's tied to a for-profit ecosystem like this.

I suppose PyTorch is/was Facebook, but it feels more arm's-length. I don't have to install and run a Facebook CLI to use it (nobody get any ideas).

You don't need an HF CLI; you just need git LFS (which I believe now ships with git) to pull the files off HF (unfortunately this still requires an account with them). It would be nice to see truly open mirrors for this stuff that don't have to involve any company.

rattt 11 days ago [-]
You don't need an HF account to download the checkpoint; it can be downloaded straight from the website/browser. Direct URL: https://huggingface.co/openjourney/openjourney/resolve/main/...
version_five 10 days ago [-]
Is it possible to download with curl or git lfs (or other "free" command line tool) with no login? I couldn't find a way to do that with the original sd checkpoints.
rattt 10 days ago [-]
Yes, it works with anything now; they removed the manual terms acceptance and the auth requirement some months after release.
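If you want to script it, a rough sketch with plain HTTPS and no HF tooling or login (the filename here is an assumption; check the repo's file listing for the real one):

    # Hedged sketch: pull the checkpoint over plain HTTPS, no account or HF CLI.
    # The filename is an assumption; look it up in the repo's file listing.
    import requests

    url = "https://huggingface.co/openjourney/openjourney/resolve/main/mdjrny-v4.ckpt"
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open("openjourney.ckpt", "wb") as f:
            for chunk in r.iter_content(chunk_size=1 << 20):
                f.write(chunk)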
version_five 10 days ago [-]
I will try it, thanks!
stainablesteel 11 days ago [-]
I don't think we're at the point where most individuals can financially support the model training; it's a company doing all this because it requires the consolidated funds of a business.

Give it 10 years and this will change.

notpushkin 11 days ago [-]
Maybe crowdfunding is an option today?
zargon 10 days ago [-]
There was a group that tried to do this recently and Kickstarter shut them down.
Rastonbury 10 days ago [-]
You can download the checkpoint right from Hugging Face, and diffusers is a library you can use for free. I'm not sure what the issue is here; that people need an account?
titaniumtown 11 days ago [-]
Someone should do this but for ChatGPT. Massive undertaking though.

Edit: https://github.com/LAION-AI/Open-Assistant

vnjxk 11 days ago [-]
look up "open assistant"
titaniumtown 11 days ago [-]
EamonnMR 11 days ago [-]
If it's using a RAIL license isn't it not open source?
nickvincent 11 days ago [-]
Yeah, that's a fair critique. I think the short answer is: it depends who you ask.

See this FAQ here: https://www.licenses.ai/faq-2

Specifically:

Q: "Are OpenRAILs considered open source licenses according to the Open Source Definition? NO."

A: "THESE ARE NOT OPEN SOURCE LICENSES, based on the definition used by Open Source Initiative, because it has some restrictions on the use of the licensed AI artifact.

That said, we consider OpenRAIL licenses to be “open”. OpenRAIL enables reuse, distribution, commercialization, and adaptation as long as the artifact is not being applied for use-cases that have been restricted.

Our main aim is not to evangelize what is open and what is not but rather to focus on the intersection between open and responsible licensing."

FWIW, there's a lot of active discussion in this space, and it could be the case that e.g. communities settle on releasing code under OSI-approved licenses and models/artifacts under lowercase "open" but use-restricted licenses.

kmeisthax 11 days ago [-]
My biggest critique of OpenRAIL is that it's not entirely clear that AI is copyrightable[0] to begin with. Specifically the model weights are just a mechanical derivation of training set data. Putting aside the "does it infringe[1]" question, there is zero creativity in the training process. All the creativity is either in the source images or the training code. AI companies scrape source images off the Internet without permission, so they cannot use the source images to enforce OpenRAIL. And while they would own the training code, nobody is releasing training code[2], so OpenRAIL wouldn't apply there.

So I do not understand how the resulting model weights are a subject of copyright at all, given that the US has firmly rejected the concept of "sweat of the brow" as a copyrightability standard. Maybe in the EU you could claim database rights over the training set you collected. But the US refuses to enforce those either.

[0] I'm not talking about "is AI art copyrightable" - my personal argument would be that the user feeding it prompts or specifying inpainting masks is enough human involvement to make it copyrightable.

The Copyright Office's refusal to register AI-generated works has been, so far, purely limited to people trying to claim Midjourney as a coauthor. They are not looking over your work with a fine-toothed comb and rejecting any submissions that have badly-painted hands.

[1] I personally think AI training is fair use, but a court will need to decide that. Furthermore, fair use training would not include fair use for selling access to the AI or its output.

[2] The few bits of training code I can find are all licensed under OSI/FSF approved licenses or using libraries under such licenses.

nickvincent 11 days ago [-]
This is a great point.

Not a lawyer, but as I understand it, the most likely way this question will be answered (for practical purposes in the US) is via the ongoing lawsuits against GitHub Copilot, Stable Diffusion, and Midjourney.

I personally agree that the creativity is in the source images and the training code. But unless it is decided that, for legal purposes, "AI artifacts" (the files containing model weights, embeddings, etc.) are just transformations of training data, and therefore content subject to the same legal standards as content, I see a lot of value in letting people license training data, code, and models separately. And if models are ruled to be just transformations of content, I expect we can adjust the norms around licensing to achieve similar outcomes (i.e., balancing open sharing with some degree of creator-defined use restriction).

nl 11 days ago [-]
The Copilot and DALL-E lawsuits aren't about whether the trained weights file can be copyrighted, though (they are about whether people's work can be freely used for training).

This is a different issue where the OP is arguing that the weights file is not eligible for copyright in the US. That's an interesting and separate point which I haven't really seen addressed before.

topynate 10 days ago [-]
The two issues aren't exactly the same but they do seem intimately connected. When you consider what's involved in generating a weights file, it's a mostly mechanical process. You write a model, gather some data, and then train. Maybe the design of the model is patentable, or the model/training code is copyrightable (actually, I'm pretty sure it is), but the training process itself is just the execution of a program on some data. You can argue that what that program is doing is simply compiling a collection of facts, which means you haven't created a derivative work, but in that case the weights file is a database, by definition, so not copyrightable in the US. Or you can argue that the program is a tool which you're using to create a new copyrightable work. But in that case it's probably a derivative work.
nickvincent 10 days ago [-]
Appreciate the distinction in the above comment that they are two distinct questions, but also agree the two questions are very connected.

I should've been more specific: I was thinking mainly of the artists v. stable diffusion lawsuit which makes the specific technical claim that the stable diffusion software (which includes a bunch of "weights files") includes compressed copies of the training data. (Line 17, "By training Stable Diffusion on the Training Images, Stability caused those images to be stored at and incorporated into Stable Diffusion as compressed copies", https://stablediffusionlitigation.com/pdf/00201/1-1-stable-d...).

I expect that if the decision hinges on this claim, it could have far-reaching implications re: model licensing. I think this is along the lines of what you've laid out here!

twoodfin 11 days ago [-]
How would you distinguish “just a mechanical derivation of training set data” from compiled binary software? The latter seems also to be a mechanical derivation from the source code, but inherits the same protections under copyright law.
kmeisthax 11 days ago [-]
Usually binaries are compiled from your own source code. If I took leaked Windows NT kernel source and compiled it myself, I wouldn't be able to claim ownership over the binaries.

Likewise, if I drew my own art and used it as sample data for a completely trained-from-scratch art generator, I would own the result. The key problem is that, because AI companies are not licensing their data, there isn't any creativity that they own for them to assert copyright over. Even if AI training itself is fair use, they still own nothing.

taneq 11 days ago [-]
Do artists not own copyright on artwork which comprises other sources (e.g. collage, sampled music)? It'd be hard to claim that, e.g., Daft Punk doesn't own copyright on their music.

(Whether other artists can claim copyright over some recognisable sample is another question.)

kmeisthax 11 days ago [-]
This is why there's the "thin copyright" doctrine in the US. It comes up often in music cases, since a lot of pop music is trying to do the same thing. You can take a bunch of uncopyrightable elements, mix them together in a creative way, and get copyright over that. But that's a very "thin" copyright since the creativity is less.

I don't think thin copyright would apply to AI model weights, since those are trained entirely by an automated process. Hyperparameters are selected primarily for functionality and not creative merit. And the actual model architectures themselves would be the subject of patents, not copyright; since they're ideas, not expressions of an idea.

Related note: have we seen someone try to patent-troll AI yet?

nl 11 days ago [-]
It depends.

The Verve's Richard Ashcroft lost partial copyright and all royalties for "Bitter Sweet Symphony" because a sample from the Rolling Stones wasn't properly cleared: https://en.m.wikipedia.org/wiki/Bitter_Sweet_Symphony

Men at Work lost a copyright case over their famous "Down Under" because it used a tune from "Kookaburra Sits in the Old Gum Tree" as an important part of the song.

rnd0 10 days ago [-]
>Do artists not own copyright on artwork which comprises other sources (eg. collage, sampled music)? It’d be hard to claim that eg. Daft Punk doesn’t own copyright on their music.

Agreed. By that logic, William S Burroughs wouldn't own his best novels: https://en.wikipedia.org/wiki/Cut-up_technique

taneq 11 days ago [-]
“Mechanical derivation” is doing a lot of heavy lifting here. What qualifies something as “mechanical”? Any algorithm? Or just digital algorithms? Any process entirely governed by the laws of physics?
kmeisthax 10 days ago [-]
So, in the US, the bedrock of copyrightability is creativity. The opposite would be what SCOTUS derided as the "sweat of the brow" doctrine, where merely "working hard" would give you copyright over the result. No court in the US will actually accept a sweat of the brow argument, of course, because there's Supreme Court precedent against it.

This is why you can't copyright maps[0], and why scans of public domain artwork are automatically public domain[1][2]. Because there's no creativity in them.

The courts do not oppose the use of algorithms or mechanical tools in art. If I draw something in Photoshop, I still own it. Using, say, a blur or contrast filter does not reduce the creativity of the underlying art, because there's still an artist deciding what filters to use, how to control them, et cetera.

That doesn't apply for AI training. The controls that we do have for AI are hyperparameters and training set data. Hyperparameters are not themselves creative inputs; they are selected by trial and error to get the best result. And training set data can be creative, but the specific AI we are talking about was trained purely on scraped images from the Internet, which the creator does not own. So you have a machine that is being fed no creativity, and thus will produce no creativity, so the courts will reject claims to ownership over it.

[0] Trap streets ARE copyrightable, though. This is why you'll find fake streets that don't exist on your maps sometimes.

[1] https://en.wikipedia.org/wiki/Bridgeman_Art_Library_v._Corel....

[2] Several museums continue to argue the opposite - i.e. that scanning a public domain work creates a new copyright on the scan. They even tried to harass the Wikimedia Foundation over it: https://en.wikipedia.org/wiki/National_Portrait_Gallery_and_...

cwkoss 11 days ago [-]
Is the choice of what to train upon not creative? I feel like it can be.
kmeisthax 10 days ago [-]
Possibly, but even if that were the case, it would protect NovelAI, not Stability.

The closest analogue I can think of would be copyrighting a Magic: The Gathering deck. Robert Hovden did that[0], and somehow convinced the Copyright Office to go along with it. As far as I can tell this never actually got court-tested, though. You can get a thin copyright on arrangements of other works you don't own, but a critical wrinkle in that is that an MTG deck is not merely "an arrangement of aesthetically pleasing card art". The cards are picked because of their gameplay value, specifically to min-max a particular win condition. They are not arrangements, but strategies.

Here's the thing: there is no copyright in game rules[1]. Those are ideas, which you have to patent[2]. And to the extent that an idea and an expression of that idea are inseparable, the idea part makes the whole uncopyrightable. This is known as the merger doctrine. So you can't copyright an MtG deck that would give you de-facto ownership over a particular game strategy.

So, applying that logic back to the training set, you'd only have ownership inasmuch as your training set was selected for a particular artistic result, and not just "reducing the loss function" or "scoring higher on a double-blind image preference test".

As far as I'm aware, there are companies that do creatively select training set inputs; i.e. NovelAI. However, most of the "generalist" AI art generators, such as Stable Diffusion, Craiyon, or DALL-E, were trained on crawled data without much or any tweaking of the inputs[3]. A lot of them have overfit text prompts, because the people training them didn't even filter for duplicate images. You can also specifically fine-tune an existing model to achieve a particular result, which would be a creative process if you could demonstrate that you picked all the images yourself.

But all of that only applies to the training set list itself; the actual training is still noncreative. The creativity has to flow through to the trained model. There's one problem with that, though: if it turns out that AI training for art generators is not fair use, then your copyright over the model dissolves like cotton candy in water. This is because without a fair use argument, the model is just a derivative work of the training set images, and you do not own unlicensed derivative works[4].

[0] https://pluralistic.net/2021/08/14/angels-and-demons/#owning...

[1] Which is also why Cory Doctorow thinks the D&D OGL (either version) is a water sandwich that just takes away your fair use rights.

[2] WotC actually did patent specific parts of MTG, like turning cards to indicate that they've been used up that turn.

[3] I may have posted another comment in this thread claiming that training sets are kept hidden. I had a brain fart, they all pull from LAION and Common Crawl.

[4] This is also why people sell T-shirts with stolen fanart on it. The artists who drew the stolen art own nothing and cannot sue. The original creator of that art can sue, but more often than not they don't.

kaoD 11 days ago [-]
> nobody is releasing training code

Interesting. Why is this happening?

skybrian 11 days ago [-]
Fair enough. "Source available" would be better than "open source" in this case, to avoid misleading people. (You do want them to read the terms.)
daveloyall 11 days ago [-]
I'm not familiar with machine learning.

But, I'm familiar with poking around in source code repos!

I found this https://huggingface.co/openjourney/openjourney/blob/main/tex... . It's a giant binary file. A big binary blob.

(The format of the blob is python's "pickle" format: a binary serialization of an in-memory object, used to store an in-memory object and later load it, perhaps on a different machine.)

But, I did not find any source code for generating that file. Am I missing something?

Shouldn't there at least be a list of input images, etc and some script that uses them to train the model?

kmeisthax 11 days ago [-]
Hahahahaha you sweet summer child. Training code? For an art generator?!

Yeah, no. Nobody in the AI community actually provides training code. If you want to train from scratch you'll need to understand what their model architecture is, collect your own dataset, and write your own training loop.

The closest I've come across is code for training an unconditional U-Net; those just take an image and denoise/draw it. CLIP also has its own training code - though everyone just seems to use OpenAI CLIP[0]. You'll need to figure out how to write a Diffusers pipeline that lets you combine CLIP and a U-Net together, and then alter the U-Net training code to feed CLIP vectors into the model, etc. Stable Diffusion also uses a Variational Autoencoder in front of the U-Net to get higher resolution and training performance, which I've yet to figure out how to train.
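To make that concrete, here's a rough, hypothetical sketch of what wiring those pieces together for a single training step looks like with diffusers; the base repo id, batch variables, and hyperparameters are illustrative, and a real run needs a dataset, LR schedule, EMA, gradient accumulation, and so on:

    # Hypothetical sketch of one latent-diffusion training step: a frozen CLIP
    # text encoder and VAE feeding a trainable U-Net (epsilon-prediction loss).
    import torch
    import torch.nn.functional as F
    from diffusers import AutoencoderKL, DDPMScheduler, UNet2DConditionModel
    from transformers import CLIPTextModel, CLIPTokenizer

    repo = "runwayml/stable-diffusion-v1-5"  # assumed base model, for illustration
    tokenizer = CLIPTokenizer.from_pretrained(repo, subfolder="tokenizer")
    text_encoder = CLIPTextModel.from_pretrained(repo, subfolder="text_encoder").eval()
    vae = AutoencoderKL.from_pretrained(repo, subfolder="vae").eval()
    unet = UNet2DConditionModel.from_pretrained(repo, subfolder="unet").train()
    scheduler = DDPMScheduler.from_pretrained(repo, subfolder="scheduler")
    optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-5)

    def train_step(pixels, captions):
        # pixels: (B, 3, 512, 512) tensor in [-1, 1]; captions: list of strings
        with torch.no_grad():
            latents = vae.encode(pixels).latent_dist.sample() * 0.18215
            ids = tokenizer(captions, padding="max_length", truncation=True,
                            max_length=tokenizer.model_max_length,
                            return_tensors="pt").input_ids
            text_emb = text_encoder(ids)[0]  # the CLIP vectors fed to the U-Net
        noise = torch.randn_like(latents)
        t = torch.randint(0, scheduler.config.num_train_timesteps,
                          (latents.shape[0],), device=latents.device)
        noisy = scheduler.add_noise(latents, noise, t)
        pred = unet(noisy, t, encoder_hidden_states=text_emb).sample
        loss = F.mse_loss(pred, noise)  # objective: predict the added noise
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        return loss.item()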

The blob you are looking at is the actual model weights. For you see, AI is proprietary software's final form. Software so proprietary that not even the creators are allowed to see the source code. Because there is no source code. Just piles and piles of linear algebra, nonlinear activation functions, and calculus.

For the record, I am trying to train-from-scratch an image generator using public domain data sources[1]. It is not going well: after adding more images it seems to have gotten significantly dumber, with or without a from-scratch trained CLIP.

[0] I think Google Imagen is using BERT actually

[1] Specifically, the PD-Art-old-100 category on Wikimedia Commons.

nl 11 days ago [-]
This isn't entirely accurate.

The SD training set is available and the exact settings are described in reasonable detail:

> The model is trained from scratch 550k steps at resolution 256x256 on a subset of LAION-5B filtered for explicit pornographic material, using the LAION-NSFW classifier with punsafe=0.1 and an aesthetic score >= 4.5. Then it is further trained for 850k steps at resolution 512x512 on the same dataset on images with resolution >= 512x512.

LAION-5B is available as a list of urls.

daveloyall 4 days ago [-]
Sorry, I didn't see this reply until just now.

To my eye, kmeisthax's comment appears to be entirely accurate.

Well, that is to say, assuming the facts listed are accurate, I agree with the conclusion: it's not "open source" at all. (And certainly not libre.)

The things you said do not describe an open source project.

The point here is that the title of this thing is incorrect. If the ML community doesn't agree, it's because they are (apparently) walking around with incorrect definitions of "open source" and "Free Software" and "Libre Software".

kelipso 11 days ago [-]
Have you looked at LAION-400M? And the OpenCLIP [1] people have replicated CLIP performance using LAION-400M.

[1] https://github.com/mlfoundations/open_clip

walterbell 11 days ago [-]
Thanks for educating the masses of machine-unwashed newbies!
JoshTriplett 11 days ago [-]
Yeah, this should not have a headline of "open source". Really disappointing that this isn't actually open, or even particularly close to being open.
EamonnMR 11 days ago [-]
Seems like 'the lawyers who made the license' and the OSI might be good authorities on what's open source. I'd love to hear a good FSF rant about RAIL though.
dmm 11 days ago [-]
Are ML models even eligible for copyright protection? The code certainly but what about the trained weights?
charcircuit 11 days ago [-]
My thought is that it is a derivative work from the training data. The creativity comes from what you choose to or not to include.
nl 11 days ago [-]
Well, open source licenses don't make sense for training artifacts, for the same reason Creative Commons licenses (rather than open source licenses) are used for "open" written and artistic works.
nagonago 11 days ago [-]
> Also, you can make a carrier! How you may ask? it is easy. In our time, we have a lot of digital asset marketplaces such as NFT marketplaces that you can sell your items and make a carrier. Never underestimate the power open source software provides.

At first I thought this might be a joke site; the poorly written copy reads like a parody.

Also, as others have pointed out, this is basically just yet another Stable Diffusion checkpoint.

notpushkin 10 days ago [-]
This particular wording sounds like it could be a poor translation from Russian. Sdelat' karjeru (literally: to make a career) means to make a living doing something, or to succeed in doing some job.
88stacks 11 days ago [-]
I was about to integrate this into https://88stacks.com but it requires a write token for Hugging Face, which makes no sense. It's a model that you download. Why does it need write access to Hugging Face!?!
bootloop 11 days ago [-]
Does it really? Have you tried it, or do you mean because of the documentation? I just skimmed through the code and haven't really seen anything related to uploading. It might not even be required.
vjbknjjvugi 11 days ago [-]
why does this need write permissions on my hf account?
deathtrader666 11 days ago [-]
"For using OpenJourney you have to make an account in huggingface and make a token with write permission."
admax88qqq 11 days ago [-]
But why
KaoruAoiShiho 11 days ago [-]
How is it equivalent? It's not nearly as good. Some transparency about how close it is to MJ would be nice, though, because it can still be useful.
whitten 11 days ago [-]
Maybe this is an obvious question, but if you generate pictures using any of these tools, can you create the same picture/character/person with different poses or backgrounds (such as for telling a story or creating a comic book), or do you get a new picture every time (such as for the cover of a magazine)?

How reproducible would the pictures be ?

Narciss 11 days ago [-]
Yes, you can create an AI model based on a few pictures of the “model” (the model can also be AI generated) and then you can generate images of all kinds with that model included.

Check out this video from prompt muse as an example: https://youtu.be/XjObqq6we4U
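And on the literal reproducibility question: generation is deterministic given the same model, settings, seed, and hardware, so you can regenerate an exact image. A hedged sketch (same assumed repo as elsewhere in the thread; the seed value is arbitrary):

    # Hedged sketch: the same prompt, settings, and seed regenerate the same
    # image; a consistent character across new poses usually needs fine-tuning
    # (e.g. Dreambooth), as described above.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained("prompthero/openjourney").to("cuda")
    generator = torch.Generator(device="cuda").manual_seed(1234)  # arbitrary seed
    image = pipe("mdjrny-v4 style, portrait of a red-haired knight",
                 generator=generator).images[0]
    image.save("knight_seed1234.png")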

haghiri 10 days ago [-]
This was my project, but since @prompthero changed their "midjourney-v4 dreambooth" model's name to openjourney, I changed my model name to "Mann-E" which is accessible here: https://huggingface.co/mann-e/mann-e_4_rev-0-1 (It's only a checkpoint and under development)
pfd1986 10 days ago [-]
Are there instructions for fine tuning the model on our own images? Thanks!
shostack 11 days ago [-]
I'm failing to train a model off of this in the Automatic1111 webui Dreambooth extension. Training on vanilla 1.5 works fine. It throws a bunch of errors I don't have in front of me on my phone.

I loaded it both from a locally downloaded copy of the model and by entering the Hugging Face path and my token with write (?!?) permissions.

Anyone run into similar issues? Suggestions?

Simon321 11 days ago [-]
Keep in mind the real Midjourney uses a completely different architecture; this is just a checkpoint for Stable Diffusion.
vintermann 10 days ago [-]
Who knows what Midjourney uses? We've only got claims made in Discord to go by.

My guess is that internally they do a slightly more careful and less porn/anime-oriented version of what the 4chan/Protogen people do: make lots of fine-tuned checkpoints, merge them, fine-tune on a selection of outputs from that, merge more, throw away most of it, try again, etc. Maybe there are other models in the mix, but I wouldn't bet on it.
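For anyone wondering what "merging checkpoints" means mechanically, a hypothetical sketch of the simplest form, a plain weighted average of two models' tensors (roughly the "weighted sum" mode in the a1111 checkpoint merger; paths and alpha are placeholders):

    # Hypothetical sketch: merge two SD checkpoints by weighted-averaging weights.
    import torch

    def merge_checkpoints(path_a, path_b, alpha=0.3, out_path="merged.ckpt"):
        a = torch.load(path_a, map_location="cpu")
        b = torch.load(path_b, map_location="cpu")
        sd_a = a.get("state_dict", a)  # SD .ckpt files usually nest weights here
        sd_b = b.get("state_dict", b)
        merged = {}
        for key, tensor in sd_a.items():
            if (key in sd_b and sd_b[key].shape == tensor.shape
                    and tensor.is_floating_point()):
                merged[key] = (1 - alpha) * tensor + alpha * sd_b[key]
            else:
                merged[key] = tensor  # keep model A's tensor if B has no match
        torch.save({"state_dict": merged}, out_path)

    # merge_checkpoints("model_a.ckpt", "model_b.ckpt", alpha=0.3)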

rks404 11 days ago [-]
Noob question: how hard is it to set up and run this on a Windows machine? I've had bad luck with Python and package management on Windows in the past, but that was a long time ago.
rm999 11 days ago [-]
It's gotten much easier in the last 24 hours because of this binary release of a popular Stable Diffusion setup + UI: https://github.com/AUTOMATIC1111/stable-diffusion-webui/rele...

(you still need a Nvidia GPU)

Extract the zip file and run the batch file. Find the .ckpt (checkpoint) file for a model you want. You can find openjourney here: https://huggingface.co/openjourney/openjourney/tree/main. Add it to the models directory.

Then you just need to go to a web browser and you can use the AUTOMATIC1111 webui. More information here: https://github.com/AUTOMATIC1111/stable-diffusion-webui

rks404 8 days ago [-]
oh this is so great - thanks!
andybak 11 days ago [-]
Yeah - it's a real pain (and I'm a Python dev)

I just use https://softology.pro/tutorials/tensorflow/tensorflow.htm

- A few manual steps, but mainly a well-tested installer that does it all for you.

rks404 11 days ago [-]
Thank you, I appreciate the honesty! I checked out the guide; it looks promising and I'll give it a try on the next system I assemble.
jpe90 11 days ago [-]
If you use the webui, it's a single git clone and an optional file edit to set some CLI flags, and that's it. You download models and move them to a directory to use them. Recently they introduced a binary release for people who are unfamiliar with git.
moneywoes 11 days ago [-]
Is there a solid comparison of Midjourney, Stable Diffusion, and DALL-E 2?
moffkalast 11 days ago [-]
I've only tried out Stable Diffusion to any real extent, but seeing what other people have gotten out of the other two, I can easily say it's the least performant of the bunch.
sdenton4 11 days ago [-]
I would be hesitant to pass judgement after only playing with one. It's easy to compare the deluge you've picked through to other people's best-picked cherries...
moffkalast 10 days ago [-]
Well sure, but after hours and hours of messing with params, my cherry-picked best cases were still light-years away from the average Midjourney example. Maybe I'm just bad at it ¯\_(ツ)_/¯
d3ckard 11 days ago [-]
I actually got better examples running SD on my M1 MBA than from my Midjourney trial.
sourabh03agr 11 days ago [-]
Looks good, but this works well only for gamey, sci-fi kinds of themes. Any suggestions for prompts that can yield interesting flowcharts to explain technical concepts?
jfdi 10 days ago [-]
What is web4.0?!
indigodaddy 11 days ago [-]
Looks like I can’t use this on M1/2?
liuliu 11 days ago [-]
This is just a Stable Diffusion model fine-tuned with Dreambooth. You can use any of these tools with it on M1/M2: Draw Things, Mochi Diffusion, DiffusionBee, or the AUTOMATIC1111 UI. (I wrote Draw Things.)
sophrocyne 11 days ago [-]
Hey all - InvokeAI maintainer here. A few folks mentioned us in other comments, so posting a few steps to try out this model locally.

Our Repo: https://github.com/invoke-ai/InvokeAI

You will need one of the following:

    An NVIDIA-based graphics card with 4 GB or more VRAM memory.
    An Apple computer with an M1 chip.
Installation Instructions: https://invoke-ai.github.io/InvokeAI/installation/

Download the model from Huggingface, add it through our Model Mgmt UI, and then start prompting.

Discord: https://discord.gg/invokeai-the-stable-diffusion-toolkit-102...

Also, I'll plug that we're actively looking for people who want to contribute to our project! Hope you enjoy using the tool.

FloatArtifact 8 days ago [-]
Any chance of supporting Intel Arc GPUs?
sophrocyne 8 days ago [-]
Won't say "never!" - Just seems NVidia has a stranglehold on the AI space w/ CUDA.

We're mainly waiting on others in the space (And/or increase investment by Intel/AMD) to offer support more broadly.

At this rate, I'd give Apple a likely shot of having better support than them w/ the neural engine & CoreML work they've been releasing.

d3ckard 11 days ago [-]
Out of curiosity, will M2s work out of the box?
sophrocyne 10 days ago [-]
Ought to! There are some enhancements coming down the pipe for Macs w/ CoreML, so while they won't be as fast as having a higher end NVidia, they'll continue to get performance increases, as well.
techlatest_net 11 days ago [-]
Some self-promotion: we've made Stable Diffusion available as SaaS on AWS[1] with per-minute pricing. The unique thing about our SaaS offering is that you can shut down/restart the SaaS environment yourself; you get charged on a per-minute basis only while the environment is running.

Also, if you want to try the SaaS for free, feel free to submit a request using our contact-us form [2]

The Web interface for SD is based on InvokeAI [3]

[1] https://aws.amazon.com/marketplace/pp/prodview-qj2mhlfj7cx42 [2] https://saas.techlatest.net/contactus [3] https://github.com/invoke-ai