Besides the point the other commenter already made, I’d like to add that inference isn’t deterministic per model. There are a bunch of sources of inconsistency:
GPU hardware/software can influence the results of floating-point operations
Different inference implementations can change the order of operations (and floating-point addition isn't associative, so reordering changes the result; see the sketch after this list)
Different RNG implementations can change the space of possible seed images
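To make the first two points concrete, here is a minimal sketch in plain Python (no GPU or ML library involved, just an illustration under those assumptions): floating-point addition is not associative, so two implementations that merely sum in a different order, the way different GPU kernels do, end up with slightly different numbers.

```python
# Minimal sketch: floating-point addition isn't associative, so reordering
# the same work changes the result slightly.
import random

# Grouping the same three numbers differently already gives different sums.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))   # False
print((0.1 + 0.2) + 0.3, 0.1 + (0.2 + 0.3))     # 0.6000000000000001 vs 0.6

# The same effect at the scale of a reduction, the kind GPU kernels reorder freely.
random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(100_000)]
print(sum(xs) == sum(reversed(xs)))              # usually False
print(abs(sum(xs) - sum(reversed(xs))))          # tiny but non-zero difference
```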
If you generate with the same prompt and settings, you get what I would consider the same image, apart from tiny variations (the outputs don't match pixel-perfect)
Edit: A piece of paper has a random 3D relief of fibers, so the exact position where a printer's ink droplet lands is also not deterministic, and no two copies of a physical catalog are identical. But we would still consider them the "same" catalog
If there’s slight variation, it means it’s not the same image.
And that's skipping over the different-RNG point entirely. You could build a machine learning model today and give it to me; tomorrow I could write a new RNG, and suddenly the model can produce images it couldn't ever produce before.
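As a concrete illustration (a minimal sketch assuming NumPy; the specific seed is arbitrary): feed the same seed to two different RNG implementations and you get different noise, and in an image-generation pipeline that noise is the starting point, i.e. a different "seed image".

```python
# Minimal sketch, assuming NumPy: the same seed fed to two different RNG
# implementations produces different noise, hence a different starting latent.
import numpy as np

seed = 1234  # arbitrary seed for illustration
legacy = np.random.RandomState(seed).standard_normal(4)   # legacy Mersenne Twister path
modern = np.random.default_rng(seed).standard_normal(4)   # newer PCG64-based Generator

print(legacy)
print(modern)
print(np.allclose(legacy, modern))  # False: same seed, different values
```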
It’s very simple: the possible resulting images aren’t purely determined by the model, as you claimed.