When generating images, is there a way to get the model to understand the relative size of objects in the picture? Say you want an Object A and an Object B, and you want Object B to be x times the size of Object A. Can you write “Object B is 3 times the size of Object A”? Or does it understand measurements, like “Object A is 1 meter long, Object B is 3 meters long”?
Relative size comes from the sum of all the descriptions and the word choice; one phrase or sentence is usually not enough. For example, say you want a 2-inch fairy climbing a bottle of water. The word “climb” already suggests that the fairy is small compared to the bottle. If you have frame/camera angle descriptions, you can add phrases like “the subject is taking one-third of the frame” to give it a sense of perspective relative to the table surface. I suggest that you use https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one. Find a picture in a search, then ask joy-caption to describe it in relative and superlative terms.
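If you build prompts in a script, here is a minimal sketch of the "sum of all descriptions" idea. The `build_prompt` helper and all the example phrases are my own illustration (not part of any image generator's API); it only joins the scene, several size cues, and an optional camera line into one prompt string that you would then paste into whatever tool you use.

```python
def build_prompt(scene, size_cues, camera=None, style=None):
    """Join style, scene, size cues, and camera text into one prompt string.

    Hypothetical helper for illustration: it does plain string joining and
    skips any part that is missing or blank.
    """
    parts = [style, scene, *size_cues, camera]
    return ", ".join(p.strip() for p in parts if p and p.strip())


# Several cues that all point at the same relative size, plus a camera line.
prompt = build_prompt(
    scene="a tiny fairy climbing a bottle of water on a table",
    size_cues=[
        "the fairy is no taller than the bottle cap",
        "the bottle towers over her",
    ],
    camera="low angle close-up from the table surface, "
           "the bottle is taking one-third of the frame",
    style="anime illustration",
)
print(prompt)
```

Whether a given model actually respects these cues varies, so treat the wording as something to experiment with rather than a guaranteed recipe.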
Thanks for the advice and for linking me to that site! When you mention frame/camera angle descriptions, can they also be added when you are trying to get pictures as cartoons, anime, comics, etc. instead of realistic photo images?
Sure, camera work brings even an anime image to life. Just a front view every time is boring. Try describing how the camera is looking at the characters or the scene and you’ll notice the difference in the dynamics. A simple example:
This picture tells a story rather than just showing an anime character. And it’s only 3 short sentences about the camera work, angle, and focus.
Cool, I hadn’t really paid attention to which direction my images were being seen from, but now I will, since it will allow other instructions for sizes. Thanks!

