Today I want to tell you how my trip to Japan went in the post-ChatGPT era.
It's not my first time there, and I know a handful of things in Japanese, all very, very basic: I did an intensive introductory course for a year and a half a long time ago (more than a decade?), and I kept up a year-long streak on Duolingo. In short, I'm a total gaijin, like anyone who doesn't know a thing about the language.
Of course, the landscape has changed a lot. On my last trip, 10 years ago, Google Lens already existed and I used it. And although Lens has served me well, it has fallen far behind what a current GPT can do. Let's get started.
Pushing Google Lens aside
Google Lens does magic: you take a photo and it translates the text it finds into the language you want. However, the photo has to be good and the text more or less uniform, and there are situations where this is not easy. Try photographing the ingredient label on a cylindrical bottle.
It can do the job well, but it's quite buggy. So I tried passing the same photos to ChatGPT. Of course, the result is not as "wow" as having the translated text overlaid on the photo. But its ability to read the text, understand the context of the photo and then translate it is in another league.
I don't know whether the guts of Google Lens have gone a while without an update, and it's probably not fair to compare the vision technology behind Google Lens with a multimodal LLM (call it GPT or Gemini). Either way, I have no doubt that sooner or later Gemini will eat Lens, or Lens will be replaced by a Gemini, and it will be an upgrade worth enjoying.
Other advantages of GPT over Lens.
Lens seems to translate with an older version of Google Translate. I don't know that for sure, but I do know that it usually works better to translate from Japanese to English and then from English into Catalan or Spanish than to do the translation in one go. This doesn't happen with a state-of-the-art LLM, which is an extra convenience.
With the chatbot, once you have the translation you are already in a conversation, so you can dig deeper. Once you know the ingredients of a product, for example, you can ask for more specific things:
Request a search to validate the accuracy of the translation.
Ask what alternative products are available.
If it's a dish on a restaurant menu, ask how to request a change from the waiter, such as not spicy or no onion, or whether there is a gluten-free option.
Tourist guide
Another use that I really liked was giving GPT a photo of a sculpture and asking what it was. I know I'm not discovering fire here, but it works really well as a "tour guide". One caveat: in these cases I always turned on internet search to try to minimize hallucinations.
When we have these things in "smart" glasses it will be amazing. Anyway, privacy went to hell years ago.
Urgent translation
Another translation case. In Japan, you board buses at the back and get off at the front, and you pay at a coin machine as you get off (they also take “contactless”, but cash is a good way to get rid of loose change). Sure, if you're traveling alone, “forever alone” or “digital nomad” style, or each companion pays their own fare, there's no problem: you put in your money and go. But… what if you're traveling as a family and want to pay for the whole group? Then you have to communicate with the driver and, oh my, he doesn't speak English!
GPT to the rescue! I asked it how to politely say "2 adults and 1 child" so that the driver would set up the machine for me when it was time to pay, and that was it. You might say I could have learned something so basic by now. True, but the need took me by surprise!
Oh, and be careful not to forget your change from the machine. It happened to some people in front of me, and the driver started honking his horn like crazy so they would come back for it! A little longer and I think he would have called the police to return the 20 yen left there!
And what is this?
More image recognition. We stayed one night in a "classic" hotel, not a Ryokan, but the room was tatami and the chairs were like this:
My wife loved the meditation chair, how the hell could we get one of these? The solution in 3 steps:
Buy an IKEA chair and don't put legs on it (my idea)
My wife hits me for talking nonsense
Take a picture of the chair and ask the GPT what that chair is called
Pay for the chair T_T (step 3.1)
Fun things
But it's not all about problem solving and knowledge, GPT also enriches the fun of the journey.
We went to the Golden Temple for the umpteenth time (“boring!”… it is beautiful, but it is SO typical and touristy…). You could build a perfect 3D model just by stitching together the photos of this temple that are already on the internet, without the need for any AI. Here is my photo, one of 3,000 just like it:
Of course, since it's such a well-worn spot, I figure GPT knows it by heart, so I say to the little one: "Shall we play a joke on grandma?" I take a fresh photo of the temple and ask GPT, which had recently gained its Ghibli powers (“level 2” of GPT photo editing): "Redo the photo as if it were winter and snowy". It nailed it, total photorealism. My son freaked out, although it isn't that "difficult" once you know how the technology works.
We send it to grandma (who is in Catalonia): “Wow, it suddenly snowed, look at what a beautiful photo.” And she swallows it whole hahaha. She has no idea what the weather is like in Japan at that time of year (nor would I if someone did the same to me).
Then the good stuff begins. The little one has a brainwave: “What if we add some snowmen?” A great idea. We ask GPT and “voilà”, some perfect snowmen.
We send it to grandma… “Wow! How cute… and did you make them?” (she is already starting to smell a rat: the snowmen are too perfect, and where would we have found carrots, sticks and balls???)
Then we go for the ultimate joke, one that would finally give the game away. "What if we make it night?" And the answer couldn't be better:
“Yes, yes! And let there be snowflakes and fireworks!” We had asked GPT to make the photo more “drawing” and less “photorealistic” with each iteration, so we ended up with a totally implausible picture of snow falling at night, with fireworks and our perfect snowmen.
And no, the last photo didn't fool her, which was the idea, and what a great time we had.
Essential vocabulary
The last use case is a typical one: preparing common vocabulary or phrases that you think you might need. But I didn't get to decide on them, my son did. So after I had covered the typical ones ("where is the toilet", "how much does this cost" and a series of boring things), my son prepared the following:
20 insect names
Butterfly -> Chou
Beetle -> Kabutomushi
Grasshopper -> Batta
…
20 fish names
Sardine -> Iwashi
Mackerel -> Saba
Sole -> Hirame
…
Very important phrases
Is there an all-you-can-eat buffet? -> Open buffet wa arimasu ka
Is the food varied? -> Tabemono wa tayou desu ka
Are there stone-oven pizzas? -> Ishigama-yaki pizza wa arimasu ka
And that's it. This article may have looked typical, but I hope it surprised you and inspired you to try unusual uses of AI. And if it made you smile, I'll be happy for a week.
If you enjoyed it, share it with more people, and if not, share it by criticizing it haha.
Until next time!
The header photo is a mess. That's me, "in quotes": the original is a photo taken with a Samsung AI filter that makes you look like a Purikura by exaggerating the pupils, which I then passed through GPT to outpaint it into a panorama. So if the original was already implausible, this is a total hallucination, but it's cool.