The end user can access the resulting tags; Apple cannot. However, iPhones do automatically report if they see something Apple does not like (in the USA).
Whatever the lack of incentives may be, this is what is happening. I just explained it a bit more simply than the article did.
It’s a cool idea: certain approaches to encryption still allow math to be performed on the data. Here’s one example: say you encrypt data X, producing ciphertext Y. You can then perform an operation on Y that, once the result is decrypted, comes out as X multiplied by four. So you can run computations on the encrypted data without ever decrypting it.
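To make the “multiply by four” idea concrete, here is a toy sketch using the Paillier cryptosystem, an additively homomorphic scheme (not the one Apple uses, and with primes far too small to be secure): raising a ciphertext to the fourth power multiplies the hidden plaintext by four, without the party doing that ever seeing the plaintext.

```python
# Toy Paillier cryptosystem: additively homomorphic, so raising a
# ciphertext to the k-th power multiplies the hidden plaintext by k.
# Demo-sized primes only; real deployments use thousands of bits.
import math
import random

p, q = 1789, 1861            # tiny demo primes, NOT secure
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
g = n + 1

def L(u):
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)  # modular inverse for decryption

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

x = 21
cx = encrypt(x)
# "Multiply by four" on the ciphertext: raise it to the 4th power.
cx4 = pow(cx, 4, n2)
print(decrypt(cx4))  # 84 == 4 * 21
```

Multiplying two ciphertexts together also adds their plaintexts, which is the building block for things like encrypted dot products.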
It would be quite complex, but I suppose you could run a machine learning model this way to tag images without ever seeing the image or knowing the resulting tag. Only the decryption key can be used to read the results (and that key is on the user’s iPhone, I suppose).
However… I don’t know how much compute cost this adds to an already expensive computation. The encryption used might not be the strongest out there. But the idea is pretty cool!
I don’t really understand the purpose of the feature — GPS tags are already embedded in the photo by the phone, so it knows the location of each picture. The phone also analyzes faces of people you’ve identified so you can search for people you know. What else does this new feature add?
It lets you type “eiffel tower” into search and get those pictures. Rather than all the other unspeakable things you did in Paris that night.
So I recently installed Immich and it does it for me using local AI
Yep, machine learning is nice
Current implementation seems like overkill. Why not just use the GPS location?
Because you took two selfies in a restaurant near there, made a huge stunning collage of a duck below the tower, and took a couple of photos from farther away to get the whole tower in view.
I’m running this tech at home, because we had the same use case. Except for me it’s running on a NAS, not Apple’s servers. The location-based solution doesn’t quite work as well when you’re an avid photographer.
If you read the article, you would know that the hard work is done locally on your iPhone, not on Apple’s servers.
If you read the article thoroughly, you’d know that a smaller model runs locally to guess that a landmark might be in a region of the image. The actual identification and tagging is done in the cloud, and the tag is then sent back.
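A rough sketch of that two-stage flow (every name and value here is invented for illustration; in the real system the embedding is homomorphically encrypted before it ever reaches the server):

```python
# Hypothetical sketch of the two-stage landmark-tagging flow.
# Stage 1 runs on the device; stage 2 would run server-side over an
# encrypted embedding in Apple's actual design.

def local_landmark_detector(photo_path):
    """On-device model: cheaply guess that a region looks like a landmark
    and produce an embedding vector for it (faked here)."""
    return [0.9, 0.1, 0.3]  # fake embedding for the candidate region

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    return dot / (norm_a * norm_b)

# Server-side database of known landmark embeddings (fabricated values).
LANDMARKS = {
    "Eiffel Tower": [0.88, 0.12, 0.31],
    "Big Ben": [0.10, 0.90, 0.20],
}

def server_identify(embedding):
    """Cloud stage: nearest landmark by cosine similarity on embeddings."""
    return max(LANDMARKS,
               key=lambda name: cosine_similarity(embedding, LANDMARKS[name]))

tag = server_identify(local_landmark_detector("photo.jpg"))
print(tag)  # Eiffel Tower
```

The point of the split is that the expensive, frequently updated landmark database lives in the cloud, while the phone only ships out a small vector, not the photo itself.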
Because then they don’t have an excuse to move all your data to Apple servers and scan it for later use.
At that scale, and because they pay for the servers, I bet they did the math and are constantly optimizing the process, since they own the entire stack. They might have somebody who worked on the M4 architecture give them hints on how to do so. Just speculating here, but arguably they are in a good position to make this quite efficient, even though whether it’s actually worth the ecological cost is, in the end, debatable.
Did they? Because it seems like everyone else is in a hype bubble and doesn’t give a shit about how much this costs or how much money it makes.
Looks like they used the “Brakerski-Fan-Vercauteren (BFV) HE scheme, which supports homomorphic operations that are well suited for computation (such as dot products or cosine similarity) on embedding vectors that are common to ML workflows”. In other words, they use a scheme that is both secure and efficient specifically for the kind of computation they do here. https://machinelearning.apple.com/research/homomorphic-encryption
At least it’s not going to be the overhyped LLM doing the analysis, it seems, considering the input is photo data.
Here, indeed, I don’t think so, but other vision models, e.g. https://github.com/vikhyat/moondream, rely on an LLM to generate the resulting description.
My gosh, what is with people’s reliance on a single thing?
Well, to be fair, and even though I did spend a bit of time writing about the broader AI hype BS cycle (https://fabien.benetou.fr/Analysis/AgainstPoorArtificialIntelligencePractices), LLMs in themselves are not “bad”. It’s an interesting idea to rely on our ability to produce and use language to describe a lot of useful things around us, so using statistics on it to try to match is actually pretty smart. Now… so many things have gone badly over the last few years that I won’t even start (cf. link), but the concept per se makes sense to rely on sometimes.
Their chips are pretty good at not drawing much power. But then you also get to the balance of power cost, computing power and physical space.
Google and Microsoft are already building their own power generation systems for even faster AI slop. That would make power a lot cheaper, and super efficient chips might not be the best answer.
I don’t know which way Apple will go, except further up their own behind. But either way, these are some really cool approaches to implementing this technology, and I hope they keep it up!
Yep, I’m reading their blog post to understand it a bit better. I don’t like that it’s enabled by default, especially with iCloud off (which should be a signal that the user does NOT want data leaving their device), but considering what others are doing, this seems like the best trade-off.