I have been watching Alec Soth's YouTube videos quite closely since he started the channel around a month ago. I think that he has great insight into photography, specifically in his (currently) three-video series 'Pictures & Words', which captures how images work alongside language in a pretty nice way.
I think that to some extent music is similar. In #3 he makes a good point about some Japanese-language photobooks: where it's clear that there is language on the page but nearly impossible (without aid) to decipher what is written, the mind opens up to interpreting everything that could be written. When you see an image, there's a sort of urge to imagine what the photographer could have written about it. Does translating the writing fully reveal intention? In the video he uses Google Translate to scan the text and read a machine translation of the work; it's partly decipherable, but the translation can never be exact and some meaning is inherently lost. How does this affect our view of the image? With the new context perhaps the image means something different: there is a threshold which we've perhaps crossed.
Music is no different. Recently there's been a surge of interest in Asian pop music. This is interesting because there are almost two distinct groups of Asian pop fans that are prominent now: fans of Japanese '70s/'80s/'90s AOR, synthpop and city pop, and fans of contemporary K-pop. The two approach language pretty differently. You will find websites like klyrics that have full lyrics in Hangul, romanised Korean, and English translation, and these are pretty extensive! There's a lot of information here. Meanwhile Japanese pop fans tend not to translate lyrics, perhaps because of how old the music is (i.e. there is no corporation raring to sell the music to teens). Either way, it shows that these two groups approach the problem of non-English-language songs differently. Here the comparison is between the instrumental (pictures) and the lyrics (words).
This can be shown no better than in the case of the song "Kikuo - Kara Kara no Kara" becoming popular on TikTok. The song was used perhaps because of its light and cheerful musical tone but, in an ironic twist, the lyrics are dark and the title translates to "Emptiness, emptiness, emptiness of emptiness." Surely learning this context changes the interpretation of the song and of the videos on the platform that used it. It doesn't necessarily cast a dark light on them, but it reveals information that colours a second viewing, whether that makes them comedic or gives the feeling of getting a joke that you wouldn't have got before.
That's not to say that the average listener of non-English music doesn't pick up some of the language implicitly; there are always going to be droves of people who know what things like 'watashi wa', 'kawaii', 'doki doki', etc. mean. The point is more that language has a clear effect on the 'image' of the song that you have in your head.
Beyond these examples of language in music, there are a lot of other similarities to photography. Both require a lot of technical skill outside of the actual 'action' of taking a photograph or playing an instrument. By this, I mean that photography requires knowledge of hardware, how to edit photos, how to colour grade photos, how to market yourself, how to ensure that your work is backed up correctly, etc. Music requires things like mixing and mastering, knowing the different file formats (FLAC, WAV, MP3, etc.), distributing music, and so on.
These meta-skills come part and parcel with digital production, but there were also a lot of parallels in analogue production. There was, in fact, a similar time-frame for the popular switch from analogue to digital in both.
Another clear parallel is the commodification of the media. Both photography and music are seen by many people as largely commodity media, with stock photo and stock music sites being popular for video and graphic design work. These then become almost 'feeder media' for other less-commodified work.
Finally, and perhaps more abstractly, both of these mediums are bound by hard constraints that others are not necessarily bound by: photography is limited by the size of the sensor or film and the focal length of the lens. Music is limited by both human perception and time itself. Everything in music has to sit roughly between 20Hz and 20,000Hz, else it cannot be heard (the lower end can perhaps be 'felt' with specialist equipment). Music must also unfold in time and can only be perceived linearly, unlike a painting, in which the eye can choose any part to look at at any time. I say that these are hard constraints, but of course work exists to challenge them, just less readily than in some forms of 'fine' art, which will often challenge their own boundaries.
I'm sure the title of this article is a bit of a stretch, but I think that there are a lot more interesting parallels between music and photography than have been explored, at least popularly. Music is a medium that defies explanation a lot of the time, and framing it through the existing discourse around other media might be a good way to re-orient thinking about it.