Scientists working to decode birdsong (newyorker.com)
wormlord 1 days ago [-]
I am surprised we don't see more "Dr. Doolittle" projects like this. I assumed rats or corvids would be good candidates for animal language translation projects since you can keep them in a confined space and record video of them. I am sure that body language plays a huge role in animal communication.

I recently read a paper[0] that claimed to have decoded the basic building blocks of sperm whale language. I went and took a look at the GitHub for Project CETI and found that most of the code was for the whale trackers and hydrophones. It seems like there are a lot of prerequisite problems that you have to solve to even get good whale recordings.

On the other hand though, whales probably don't rely on body language since they are communicating way out of line of sight. So it may be easier in that regard.

Anyways, I am convinced that we will figure out how to teach some basic human concepts (like self) to animals and the intelligence of even "stupid" animals like chickens will make people more reluctant to eat meat.

[0] https://www.nature.com/articles/s41467-024-47221-8#:~:text=S....

jampekka 1 days ago [-]
A major reason was that there was a dogma that language is a uniquely human phenomenon. It was already part of Descartes' hugely influential philosophy of mind and got a strong revival with Chomskian linguistics and the "cognitive revolution" in the late 1950s.

There was sporadic research into animal linguistics, e.g. the Koko study, but those were dismissed on the grounds that animals can't have language by definition.

The ethics of enslaving and torturing animals is definitely part of the motivation for the dogma.

Fripplebubby 1 days ago [-]
To be clear - as of today, many researchers would agree that language is still a uniquely human phenomenon. They discuss this pretty explicitly in the article linked, how it is important to draw a distinction between language and communication. There are no non-human species that have been found to use language for the Chomskian definition of language (using a finite set of symbols to represent an infinite number of communicable meanings).

However, this "dogma" as you call it is beginning to be weakened as researchers document more nuance and complexity in non-human communication than ever before, and so some researchers begin to say, "maybe we shouldn't have this all-or-nothing view of language". But it is simply not true that researchers are suppressing evidence of language in animals out of a desire to enslave and torture them.

jampekka 1 days ago [-]
> There are no non-human species that have been found to use language for the Chomskian definition of language (using a finite set of symbols to represent an infinite number of communicable meanings)

It's far from clear whether humans are capable of the Chomskian criteria of language. And Chomskian linguistics have more or less collapsed with the huge success of statistical methods.

earthboundkid 1 days ago [-]
Chomsky's poverty of stimulus argument is, if anything, strengthened by LLMs. You need to read the entire internet to make statistical methods work at producing grammatical texts. Children don't read the entire internet but do produce grammatical texts. Therefore &c. QED.
OneManyNone 21 hours ago [-]
I think this is greatly complicated by the fact that the human brain has been "pre-trained" (in the deep learning sense) by hundreds of millions of years of evolution.

A pre-trained LLM can also learn new concepts from extremely few examples. Humans may still be much smarter but I think there's a lot of reason to believe that the mechanics are similar.

jampekka 9 hours ago [-]
The poverty of the stimulus (POS) argument is that "evolutionary pre-training" in the form of (recursive) grammar is fundamentally required and cannot be inferred from the stimulus.

The argument is based on multiple questionable assumptions of Chomskian linguistics:

- Humans actually learn grammar in the Chomskian way
- Syntax is separate from semantics, so only language (utterances) can be learned from utterances, and not e.g. what is seen in the environment
- At least in the Gold's formalization of the argument language is learned only from "positive examples", so e.g. the learner can't observe that someone does not understand some utterance

One could argue for a (very) weak form of POS that there has to be some kind of "inductive bias" in the learning system, but this applies to all learning as shown by Kant. The inductive bias can be very generic.

foldr 4 hours ago [-]
>At least in the Gold's formalization

It seems to be a persistent myth (possibly revived more recently due to Norvig?) that Chomsky's POS argument has some interesting connection to Gold's theorem. The two things have only a very loose logical connection (Gold's theorem is in no sense a formalization of any claim of Chomsky's), and Chomsky himself never based any of his arguments for innateness on Gold's theorem. Here is a secondary source making the same point (search for 'Gold'): https://stevenpinker.com/files/pinker/files/jcl_macwhinney_c...

The assumption that syntax is 'separate from semantics' also does not figure in any of Chomsky's POS arguments. Chomsky argued that syntax was separate from semantics only in the fairly uncontroversial sense that there are properly syntactic primitives (e.g. 'noun', 'chain', 'c-command') that do not reduce entirely to semantic or phonological notions. But even if that were untrue, it would not undermine POS arguments, which for the most part can be run without any specific assumptions about the syntax/semantics boundary. Indeed, semantic and conceptual knowledge provides an equally fertile source of POS problems.

jjk7 1 days ago [-]
Children do get ~6000 hours a year of stimulus. Spoken, unspoken, written, and body language. Even then they aren't able to form language proficiently until 5 or 6 years old. Does the internet contain 30,000 hours of stimulus?
feoren 1 days ago [-]
30,000 hours is about the amount of new video uploaded to YouTube every hour.
jjk7 1 days ago [-]
That's astonishing. If you watched all of them, how much new information would you learn? I suspect a large portion of them are the same information presented differently; for example a news story duplicated by hundreds of different channels.
Tostino 24 hours ago [-]
There's a huge amount of video-game footage included in those "hours of video uploaded per hour".

So very, very little new info will be conveyed by the vast majority of the content.

Affric 21 hours ago [-]
Yeah, I imagine every moment of communication a child receives is new information not just baby talk about getting the spoon in their mouth and asking them if they have pooped.
culi 1 days ago [-]
> Does the internet contain 30,000 hours of stimulus?

Is this a joke?

jjk7 1 days ago [-]
I'm sure someone else could calculate the informational density of all of the text on the internet vs. 30,000 hours of sight, smell, touch, sound, etc density. My intuition tells me it's not even close.
feoren 1 days ago [-]
Does the information contained in smell and touch contribute to the acquisition of language? Keep in mind you'd be arguing that people born without a sense of smell take longer to develop language, or are otherwise deficient in it in some way. I'm doubtful. It's certainly tricky to measure full sight / sound vs. text, but luckily we don't have to, because we also have video online, which, surprise surprise, utterly dwarfs 30,000 hours of sight and sound in terms of total information.
c22 18 hours ago [-]
One qualitative difference is that the child's 30,000 hours is realtime, interactive, and often bespoke to the individual and context. All the videos on youtube are static and impersonal.
culi 1 days ago [-]
I agree it's not even close! A single day of YouTube uploads alone is 720,000 hours!
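
For what it's worth, the numbers thrown around in this subthread do line up with each other. A quick check, taking the thread's own figures as given (about 6,000 hours of stimulus per year, five years, and about 30,000 hours of video uploaded per hour); none of them are independently verified here:

```python
# Figures below are the ones quoted in the thread, not verified statistics.
HOURS_OF_STIMULUS_PER_YEAR = 6_000   # jjk7's estimate for a child
YEARS_TO_PROFICIENCY = 5             # lower end of "5 or 6 years old"
UPLOAD_HOURS_PER_HOUR = 30_000       # feoren's figure for YouTube uploads

child_hours = HOURS_OF_STIMULUS_PER_YEAR * YEARS_TO_PROFICIENCY
upload_hours_per_day = UPLOAD_HOURS_PER_HOUR * 24

print(child_hours)                          # 30000
print(upload_hours_per_day)                 # 720000
print(upload_hours_per_day // child_hours)  # 24
```

So on these figures one day of uploads is about 24 childhoods' worth of hours. That only compares hours with hours, and says nothing about how much information each hour carries.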
snapcaster 23 hours ago [-]
I think what he's saying is that "real world" interaction is so high bandwidth it dwarfs internet (screen based) stimulation. Not saying I agree just that he's not comparing hours being alive to hours of youtube
thaumasiotes 16 hours ago [-]
> Even then they aren't able to form language proficiently until 5 or 6 years old.

Not even close; they're already proficient at two.

culi 1 days ago [-]
> And Chomskian linguistics have more or less collapsed with the huge success of statistical methods.

People have been saying this for decades. But the hype around large language models is finally starting to wane and I wouldn't be surprised if in another 10 years we hear again that we "finally disproved generative linguistics" (again?)

Also, how many R's are in "racecar"?

OneManyNone 21 hours ago [-]
Counterpoint: What progress has generative linguistics made in the same amount of time that deep learning has been around? It sure doesn't seem to be working well.

Also, the racecar example is because of tokenization in LLMs - they don't actually see the raw letters of the text they read. It would be like me asking you to read this sentence in your head and then tell me which syllable would have the lowest pitch when spoken aloud. Maybe you could do it, but it would take effort because it doesn't align with the way you're interpreting the input.
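
A toy sketch of the tokenization point, assuming a made-up six-entry vocabulary and a greedy longest-match rule. Real LLM tokenizers are learned from data and much larger, so this only illustrates the general idea that the model receives token IDs, not letters:

```python
# Toy illustration only: TOY_VOCAB is hypothetical, not any real model's vocabulary.
TOY_VOCAB = {"race": 0, "car": 1, "r": 2, "a": 3, "c": 4, "e": 5}

def toy_tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match tokenization of text into integer IDs."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest candidate piece first, then shorter ones.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            # No piece starting at position i is in the vocabulary.
            raise ValueError(f"no token for {text[i]!r}")
    return ids

print(toy_tokenize("racecar"))  # [0, 1]
print("racecar".count("r"))     # 2
```

The model in this sketch would see `[0, 1]`. The fact that those two IDs contain two R's between them is not visible in the input and has to be learned some other way.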

jampekka 9 hours ago [-]
Also, being able to count the number of letters in a word is not required for language capability, in the Chomskian sense at least.
foldr 4 hours ago [-]
>What progress has generative linguistics made in the same amount of time that deep learning has been around? It sure doesn't seem to be working well.

Working well for what? Generative linguistics has certainly made progress in the past couple of decades, but it's not trying to solve engineering problems. If you think that generative linguistics and deep learning models are somehow competitors, you've probably misunderstood the former.

circlefavshape 1 days ago [-]
> using a finite set of symbols to represent an infinite number of communicable meanings

This always seemed wildly implausible to me. A very large number of communicable meanings, sure, but infinite?

Majromax 1 days ago [-]
> This always seemed wildly implausible to me. A very large number of communicable meanings, sure, but infinite?

This is "trivial" in the boring kind of way. With just digits, we can communicate an infinite set of distinct numbers simply by counting.

marcosdumay 23 hours ago [-]
We can't really communicate an infinite amount of numbers. People just can't read or remember too many digits.
Affric 21 hours ago [-]
We can. Scientific notation with 1 significant figure can be meaningful because we can use it to figure out order relations. It’s an infinite language.
conradev 1 days ago [-]
David Deutsch claims in “The Beginning of Infinity” this is a property called universality, and that we have it. A short excerpt:

https://www.lesswrong.com/posts/HDyePg6oySYQ9hY4i/david-deut...

The whole book is worth reading, though, as it lays it out in more detail.

gyomu 1 days ago [-]
Seems trivially demonstrable because you can just chain things forever?

Mary ran after the dog and the dog was brown and a cat came along and…

MourYother 1 days ago [-]
> you can just chain things forever

I think you're going to find out that no, you can't, and this impossibility is going to trivially demonstrate itself.

bornfreddy 1 days ago [-]
Recite 99 bottles of beer on the wall, but start from 1 and change so the number increases? Stop when there are no remaining numbers or when you reach infinity, whichever comes first.
marcosdumay 23 hours ago [-]
So, is this a proposal to test how long it takes for you to lose your count?
nick__m 22 hours ago [-]
They are talking as if language was some platonic construct like a Turing machine with an infinite tape and you are talking about the concrete reality where there are no such things as an infinite tape.

Both viewpoints are useful: they can prove general properties that hold for arbitrarily long sequences of words, and you put a practical bound on that length.

marcosdumay 15 hours ago [-]
The question is if humans are capable of infinitely extensible language.

That's clearly false. It's not about some platonic mathematical simplification. Humans patently do not fit the Chomsky criterion for intelligence.

In fact, I'm pretty sure it's physically impossible for any real being to fit it.

snapcaster 23 hours ago [-]
Can you say more? English doesn't have any cap on sentence length. I think I'm missing your point.
MourYother 19 hours ago [-]
> English doesn't have any cap on sentence length

Well, yes and no. Constructing this "infinite" sentence will run into some serious problems once the last star burns out, possibly sooner.

Teever 10 hours ago [-]
"I have a truly marvelous demonstration of this proposition which this margin is too narrow to contain."
bryanrasmussen 1 days ago [-]
Since English has several possible sentences that are infinite in length, made up of only one word even (https://medium.com/luminasticity/grammatical-infinities-what...), I have to agree with all the "this is trivial" comments.
TeMPOraL 1 days ago [-]
Whatever "finite set of symbols" humans use to communicate is not the finite set of symbols that form letters or words. Communication isn't discrete in a practical sense, it's continuous - any symbol can take not just different meanings, but different shades and superpositions of meanings, based on differences in the way it's articulated (tone, style of writing - including colors), the context in which it shows up, and the context of the whole situation.

The only way you can represent this symbolically is in the trivial sense in which you can represent everything, because you can use a few symbols to build up natural numbers, and then you can use those numbers to approximate everything else. But I doubt it's what Chomsky had in mind.

mmooss 18 hours ago [-]
> "maybe we shouldn't have this all-or-nothing view of language"

That idea seems like a strawperson, not something anyone seriously thinking about it would say. Everyone sees animals communicate.

Fripplebubby 4 hours ago [-]
It's not a question of communication, it's a question of language.
its_bbq 19 hours ago [-]
This is not really the point though. Why is "uses Chomskian language" the criterion for whether or not it's okay to change and slaughter a living being?
Fripplebubby 4 hours ago [-]
There is and remains a desire to explain exactly how it is that humans are different than other animals. Language or the language faculty has been touted by some as this thing.
melagonster 17 hours ago [-]
I do not know of anyone who did this. I don't think biologists care about this.
6510 19 hours ago [-]
Those researchers are just making noises. It doesn't mean anything.
netdevnet 1 days ago [-]
Your usage of "language" here is akin to laymen's usage of "hypothesis" and "theory" and then trying to apply it in an academic context. Same sequence of letters but different meaning. In linguistics, "language" has a specific definition that only humans have been shown to have. Some trained individuals like Koko do seem to demonstrate a very limited ability to use "language" in the linguistics sense.

You might argue that the definition itself is arbitrary and coming from the same place that geocentrism, creationism and flat-Earth views come from. I can't argue for or against that.

I suspect things are more nuanced than the current definition that we have though, especially after the recent study from the Scientific American that heated up Hacker News in a way that only "Is CS a science" articles can.

jampekka 1 days ago [-]
There's no consensus on the definition of what language is.

Chomskian linguistics does posit that human language is based on (innate) recursive grammars (narrow language faculty hypothesis), but this has always been a contentious question. And per that definition humans too have demonstrated only very limited ability in e.g. infinite embedding.

throwawaymaths 1 days ago [-]
My dog can push buttons to let me know what he wants. Those buttons speak in English. Is that language?
IIAOPSW 21 hours ago [-]
"Language" in the sense of "the thing only humans have been shown to do" requires a bit more than just one to one correlations between signifiers and objects (or a "sentence" of signifiers with the same meaning as all of the words added together independently). For a system of symbols to be "language" there must be a difference between "what the cat ate" and "what ate the cat". No animal communication has been shown to have a grammar to it, and thus the ability to express exponentially many unique ideas with each additional word.
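
A rough sketch of the word-order point, under the simplifying assumption that "grammar" means nothing more than sensitivity to order (real syntax is much richer than this). It shows that the two phrases are identical as bags of words, and compares how the number of distinguishable messages grows with and without order:

```python
from collections import Counter
from math import comb

a = "what the cat ate".split()
b = "what ate the cat".split()

same_words = Counter(a) == Counter(b)   # identical as unordered bags of words
same_message = a == b                   # different once order is respected
print(same_words, same_message)         # True False

def ordered_messages(vocab_size, length):
    """Distinct word sequences of a given length (order matters)."""
    return vocab_size ** length

def unordered_messages(vocab_size, length):
    """Distinct bags of words of a given length (order ignored)."""
    return comb(length + vocab_size - 1, length)

print(ordered_messages(4, 4), unordered_messages(4, 4))  # 256 35
```

With order, each extra word multiplies the count by the vocabulary size. Without it, the count grows only polynomially in the message length.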
throwawaymaths 20 hours ago [-]
I feel like there are human languages where the symbolic distinction between "what the cat ate" and "what ate the cat" is nil and the understanding is achieved contextually.
throwup238 1 days ago [-]
I think it only counts if he can express that he wants you to urinate on the same fire hydrant after he does.

That’s the minimum level of complexity science will accept.

IggleSniggle 1 days ago [-]
I think most dog owners would tell you that their adult dogs can communicate things like this, but that the language is unfortunately siloed into a very personal relationship that is difficult for even the human part of the pair to demonstrate, making it difficult to do science about
wormlord 1 days ago [-]
Sometimes at bedtime my cat will go to the door and scream nonstop. I don't know why he does it. Maybe it is for food or attention. But the only way I have found to get him to stop is to pick him up, put him on his special pillow, squish him, and have my partner join me in telling him "we are going to bed, it's bedtime".

I'd say about 80% of the time he listens. So he is capable of understanding what we want him to do, and capable of suppressing his own personal desires in order to maintain harmony in our group. Funny enough, he won't go to bed unless both me and my partner tell him it is bedtime, so maybe he is only obeying because there is some majority consensus?

Because of this, I find it easy to believe that a cat or dog could be taught something as abstract as "self" if they can understand commands and intent and group dynamics. It's just difficult to tell what is "understood" and what is just conditioned behavior. Hell, I can't even answer that question for myself as a human.

jerf 1 days ago [-]
However, another major reason is that people have repeatedly gone seeking for language-like or human-language-level behaviors in animals, and repeatedly and consistently failed.

It is also worth pointing out that detecting language is a great deal easier than understanding language. Something like https://www.youtube.com/watch?v=vvr9AMWEU-c is reasonably recognizable as clearly some sort of language even if we have no (unassisted) human idea what it is saying. We can tell with quite high confidence that most animal sounds are not hiding some deeper layer of information content.

Such exceptions as there are, like whalesong, take you back to my first paragraph, though.

The idea that language is a uniquely human phenomenon may be "dogma", but it is also fairly well-founded in fact. It should also not be that surprising; had another species developed language first, they'd be the ones looking around at their surroundings being surprised they are the only ones with proper language, because they'd probably be the dominant species on the planet. It isn't a "humanist" bias, in some sense that humans are super special because they're humans, it's a "first species to high language" bias, which happens on this planet to be humans.

__MatrixMan__ 1 days ago [-]
There still is, as far as I can tell. Whenever my curiosity drives me to take a psychology or philosophy class I end up with the feeling that they think part of their job is to reassure the rest of the humans that we are in fact special. It feels like some kind of leftover from when that kind of work was done by monks.
goatlover 1 days ago [-]
We are objectively special in creating technological civilization with all sorts of cultural artifacts like philosophy that we have no evidence for in other species that have existed on this planet, other than possibly a few of our close hominid relatives. Hominids are a very special evolutionary branch in that sense.

When we think about ETs, we're wondering about technological civilizations on other planets with space craft and radio telescopes, not the equivalent of birds or dolphins.

bongodongobob 1 days ago [-]
Really? I distinctly remember a lot of pissed off kids in my college philosophy and psychology classes trying to defend their religious beliefs and that we are more than just monkeys with fancier tools. Most of the religious folks (at least vocally) dropped out of Philosophy 101 after 2 weeks. It was incredibly entertaining. I guess this was 20 years ago, but assuming we are a more secular society I guess I thought that would still be the case.
__MatrixMan__ 21 hours ago [-]
It's a bit different at the intro level. I'm talking about the professors and the grad students. It's not that they're directly religious, but I get a status-quo-preserving kind of feeling from them. Like maybe they're influenced by a tradition of not calling your patron an ape--or somesuch.
bongodongobob 20 hours ago [-]
Weird, I would think it would be much less hand-holdy at that level.
__MatrixMan__ 3 hours ago [-]
It feels more like gatekeeping: let nobody who thinks animals talk to each other call themselves a psychologist.
marcosdumay 23 hours ago [-]
Hum... So, you are in full agreement with the GP?
mmooss 18 hours ago [-]
> those were dismissed on the grounds that animals can't have language by definition.

What modern person said that? I've read a bit about it and didn't encounter that.

> The ethics of enslaving and torturing animals is definitely part of the motivation for the dogma.

What does that have to do with language?

throwawaymaths 1 days ago [-]
Dr Doolittle on corvids would be easy.

1. Train crows to push a touchscreen for reward of food.

2. Next set up two touchscreens back to back. Make it so touching one screen only dispenses food on the other side.

3. Next make it so food is dispensed on the other side only when one crow is perched at each terminal.

4. Next make it so food is only dispensed after a crow says something to the other crow on the other side.

5. Next display a picture on one terminal and give the other crow the choice of four quadrants. The food is dispensed if the quadrant chosen on the far side matches the displayed picture.

6. Start decoding words.
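
The reward rule for steps 1-5 could be sketched roughly like this. Everything here is hypothetical (the `Stage` and `Trial` names, the fields, and the assumption that each stage keeps the conditions of the earlier ones); it is a sketch of the protocol's logic, not a tested rig:

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional

class Stage(IntEnum):
    TOUCH_FOR_FOOD = 1      # step 1: touch the screen, get food
    TOUCH_FEEDS_OTHER = 2   # step 2: touch feeds the far side
    BOTH_PERCHED = 3        # step 3: a crow must be at each terminal
    CALL_REQUIRED = 4       # step 4: a call must be heard first
    PICTURE_MATCH = 5       # step 5: far crow must pick the shown quadrant

@dataclass
class Trial:
    touched: bool = False
    crows_perched: int = 1
    call_heard: bool = False
    shown_quadrant: Optional[int] = None
    chosen_quadrant: Optional[int] = None

def should_dispense(stage: Stage, trial: Trial) -> bool:
    """Return True if food should be dispensed (conditions are cumulative)."""
    if not trial.touched:
        return False
    if stage >= Stage.BOTH_PERCHED and trial.crows_perched < 2:
        return False
    if stage >= Stage.CALL_REQUIRED and not trial.call_heard:
        return False
    if stage >= Stage.PICTURE_MATCH:
        if trial.chosen_quadrant is None:
            return False
        if trial.chosen_quadrant != trial.shown_quadrant:
            return False
    return True

def dispense_side(stage: Stage, touched_side: str) -> str:
    """From step 2 onward, food comes out on the side opposite the touch."""
    if stage == Stage.TOUCH_FOR_FOOD:
        return touched_side
    return "B" if touched_side == "A" else "A"
```

Step 6 would then presumably start from a log of which calls preceded which correct choices at the last stage.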

dekhn 23 hours ago [-]
Not exactly your protocol, but I'm reminded of https://www.smithsonianmag.com/smart-news/scientists-taught-...
wormlord 1 days ago [-]
This could be done cheaply with some rooted Kindle Fire tablets. I don't follow 100% but it sounds cool.
lacker 1 days ago [-]
Hey, that makes a lot of sense. There's a lot of crows who come to the bird feeder in my backyard. I would just have to figure out how to easily make a food dispenser, and what sort of touch screen a crow can activate....
throwawaymaths 1 days ago [-]
I would do it myself but where I am there are almost no corvids, they (and city pigeons) are basically displaced by the highly aggressive local bird. Which seems to swarm around from time to time but doesn't stay anywhere permanent locally except for grocery store parking lots. That's probably enough to leak location info on this anon account so I'll shut up now
appplication 13 hours ago [-]
I think the thing about animal language is there is no objective truth to it. Even with well defined words and educated and experienced communicators, miscommunication and ambiguity is everywhere in human language.

For animals I think there is less language as we know it. Communication is an instantaneous and unfiltered expression of mental state, and there isn’t much guarantee that it’s received as intended. My dog is the friendliest little dude. He will play bow other dogs immediately, and then jump around with excitement. But it’s extremely common that other dogs misread this and respond with aggression.

I suppose I think animal language is less something we could ever learn academically and more something you just feel.

pvaldes 1 days ago [-]
> I am surprised we don't see more "Dr. Doolittle" projects like this

Robots asking millions of birds to attack any hooman at sight in its own bird language, AKA the Tippi Hedren project. Seems pretty fly.

speed_spread 1 days ago [-]
More likely, instructing birds to fly in Nike Swoosh formation (or some other logo) on command for cheap low altitude sky adverts.
hotspot_one 1 days ago [-]
And this is why capitalism will always win over communism :)
forgotoldacc 17 hours ago [-]
I think one problem with the idea of learning the language of captive animals is that it runs on the assumption that animals have an inherent language. There was some research centuries ago that involved raising kids without exposure to the outside world of languages in hopes of learning what our "natural" language was. Turns out there isn't one.

And I think the same applies to animals. Bird songs have already been noted to have accents by region. We'll probably end up finding out that if animals do have complex language, it's something they develop through community exposure just like humans do, and it'll vary by area. Crows in Texas will talk very different from crows in England, I'm sure.

With pets, most of them are kept isolated from the world at large. It wouldn't surprise me if dogs raised by people have very stilted or nonexistent language compared to feral dogs. Maybe 200 years from now, we'll see the modern concept of raising dogs completely alone as inhumane and sending them off to doggy daycare a couple times a week to learn the local dog language and "culture" will be normalized.

LinuxBender 1 days ago [-]
Will there be a place I can upload bird recordings? I have half a dozen wild grouse that think they are my chickens and they have dozens of different sounds they babble at me and I have no idea what they are trying to convey. I try to mimic the sounds they make. Sometimes they chat back and forth with me until I get bored, sometimes they follow me whereas one particular sound makes them wander off.
monknomo 1 days ago [-]
The Cornell Bird Lab via the Macaulay Library accepts citizen science recordings

https://support.ebird.org/en/support/solutions/folders/48000...

I think unusual bird behavior recordings are appreciated by scientists

LinuxBender 1 days ago [-]
Thank you for that. I will check it out! Maybe one day the decoding project can ingest all the sound content.

Reading through the rules I like these people already. They prefer high quality .wav files as do I. Not sure if I have the skills to edit to their standard but I will try.

joshvm 1 days ago [-]
Have a look at xeno-canto as well, a large repo of animal sounds. It's more of a general archive than specifically for “understanding”, for example it’s often used to train audio recognition models.

https://xeno-canto.org/

ainiriand 1 days ago [-]
I'm sorry but that's just adorable.
LinuxBender 1 days ago [-]
No need to apologize. Many of the animals here are fun to interact with. Maybe this upcoming winter I will try to record the deer when it's feeding time. There are usually 2 or 3 fawns right on my heels, testing each food pile to see which one is the right one, not realizing they should just start with the first one to get more time to eat.
nanna 1 days ago [-]
May I use this opportunity to alert you to the excellent bird identification app by Cornell University, Merlin.

https://merlin.allaboutbirds.org/

wileydragonfly 1 days ago [-]
Great app for playing bird songs and annoying them once you’ve identified them, too. Sometimes you can get a few chirping really loudly at you and confused why their new friend looks like an iPhone.
yunohn 1 days ago [-]
+1, this app is an eye opener to the nature around oneself. So much so, I have actually linked it to my iPhone's action button to make it easier to open on a whim.
nanna 23 hours ago [-]
Good idea!
userabchn 1 days ago [-]
I installed it a year or two ago but was disappointed by its identification abilities. Then it changed to require providing an email address so I deleted it.
nanna 23 hours ago [-]
Can confirm that it doesn't require an email these days. You can create an account and upload your recordings, but otherwise no account is needed.

I've been more than happy with its ID abilities.

Maybe give it another try?

joshvm 1 days ago [-]
You might have used the wrong model. They tend to be location specific, so if you live in eg Australia make sure you get the appropriate pack. It does skew to more common species - there is a very long tail in species recognition.
drilbo 18 hours ago [-]
I didn't immediately see a way around the "Please enter your email" prompt, but long pressing the icon (on Android) gives a context menu with options like "Choose photo" and "Start new recording" that open into the main app without any login.
frereubu 1 days ago [-]
> “Social birds . . . are constantly chatting to each other,” Mike Webster, an animal-communication expert at Cornell, says. “What in the hell are they saying?”

Whenever I hear this question I always remember the Eddie Izzard skit about birdsong being territorial, so the nightingale in "A Nightingale Sang in Berkeley Square" was essentially shouting "Get out of Berkeley Square! It's my Square!"

cubefox 1 days ago [-]
worble 1 days ago [-]
Haha same, and this Mitchell and Webb skit as well which parodies the same thing https://www.youtube.com/watch?v=I9A5y6mXMh8
dilawar 1 days ago [-]
Dogs in my street barking at night are totally saying similar things for half an hour.
11235813213455 1 days ago [-]
While I'm totally fine with bird sounds, dog barks are so annoying, almost as much as motorbikes
infruset 1 days ago [-]
Does anyone have a clue how far we are from having "LLMs for animals"? Even if we don't understand what the LLM is saying to a dolphin or a monkey, does it change much from feeding millions of texts to a model without ever explaining language to it as a prerequisite?
jampekka 1 days ago [-]
A predictive/generative model of animal "vocalizations" would be almost trivial to do with current speech or music generation models. And those could be conditioned with contextual information easily.
joshvm 1 days ago [-]
Generative models yes, since there are terabytes of audio available. High quality contextual info is much harder to obtain. It’s like saying that we could easily build a model for X if we had training data available.

With LLMs we can leverage human insight to e.g. caption or describe images (which was what made CLIP and successors possible). With animals we often have no idea beyond a location. There is work to include kinematic data with audio to try and associate movement with vocalisation but it’s early days.

https://cloud.google.com/blog/transform/can-generative-ai-he...

velcrovan 1 days ago [-]
Wouldn't we need several hundred gigabytes of ingestible/structured contextual info for animal vocalizations in order to train a model with any accuracy? Even if we had it, seems to me the model would be able to tell us what sounds probably “should” follow those of a given recording, but not what they mean.
jampekka 1 days ago [-]
lossolo 1 days ago [-]
We could train a transformer that could predict the next token, whether it's the next sound from one animal or a sound from another animal replying to it. However, we wouldn't understand the majority of what it means, except for the most obvious sounds that we could derive from context and observation of behavior. This wouldn't result in a ChatGPT-like interface, as it is impossible for us to translate most of these sounds into a meaningful conversation with animals.
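
To make "predict the next token" concrete, here is a minimal sketch that swaps the transformer for a plain bigram counter, which is a far weaker model but has the same input and output shape. The call labels are made up; real data would first need audio segmented and clustered into discrete units:

```python
from collections import Counter, defaultdict

# Hypothetical call labels standing in for discretized audio segments.
recordings = [
    ["chirp", "trill", "chirp", "trill", "caw"],
    ["chirp", "trill", "caw"],
    ["caw", "chirp", "trill"],
]

def fit_bigrams(sequences):
    """Count how often each call is followed by each other call."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev):
    """Most frequent follower of `prev`, or None if `prev` was never seen."""
    if prev not in counts:
        return None
    return counts[prev].most_common(1)[0][0]

model = fit_bigrams(recordings)
print(predict_next(model, "chirp"))  # trill
print(predict_next(model, "trill"))  # caw
```

Note that the model predicts that "trill" tends to follow "chirp" without assigning any meaning to either, which is exactly the limitation described above.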
visarga 1 days ago [-]
Why not label a fine-tuning dataset with human descriptions based on video recordings? We explain in human language what they do, and then tune the model. It doesn't need to be a very large dataset, but it would allow models to translate directly from bird calls to human language.
amelius 1 days ago [-]
But then it's not a translation of the bird tweets, but more like a predictive mapping from tweets to behaviors.
lossolo 19 hours ago [-]
What if they just sit and talk? What is the description of this? What if only part of the communication is relevant? What if it's not relevant at all because they reacted to atmospheric changes? Or electromagnetic signals, that can't be observed on video? Or smell? Or sound outside of human hearing frequency? What if the decision based on communication is deferred? etc etc

As I mentioned before, only the most obvious examples of behaviors and context can be translated into anything meaningful.

goatlover 1 days ago [-]
Reminds me of Wittgenstein's "if a lion could speak, we would not understand it."
4gotunameagain 1 days ago [-]
It's "almost trivial" and "easily" done, I only wonder why we aren't speaking to animals already.

Oh wait. Because the devil's in the details, the ones SW dev hubris glosses over ;) ;)

jampekka 1 days ago [-]
To clarify: I didn't mean a model that would "translate" animal sounds to some representation of language or meaning. I meant a model that would capture statistical regularities in animal sounds and perhaps be able to link these to contextual information (e.g. time of day, other animals around, season etc).

By almost trivial I mean it wouldn't require much new technology. Something like WaveNet or VQ-VAE could be applied almost out of the box.

Data availability may be a significant problem, but there are some huge animal sound datasets. E.g. https://blog.google/intl/en-au/company-news/technology/a2o-s...
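
To make "statistical regularities linked to context" concrete, here is a minimal sketch, far simpler than WaveNet or VQ-VAE. It assumes the audio has already been cut into calls and clustered into discrete labels; the labels "A", "B", "C", the contexts "dawn" and "dusk", and the function names are all invented for illustration. It only counts which call tends to follow which in a given context, so it predicts without saying anything about meaning.

```python
from collections import Counter, defaultdict


def train(sequences):
    """Count next-call frequencies keyed by (context, previous call)."""
    counts = defaultdict(Counter)
    for context, calls in sequences:
        for prev, nxt in zip(calls, calls[1:]):
            counts[(context, prev)][nxt] += 1
    return counts


def predict(counts, context, prev):
    """Return (most frequent next call, estimated probability), or None if unseen."""
    options = counts.get((context, prev))
    if not options:
        return None
    call, n = options.most_common(1)[0]
    return call, n / sum(options.values())


# Hypothetical recordings: (context, sequence of call-cluster labels).
recordings = [
    ("dawn", ["A", "B", "A", "B", "C"]),
    ("dawn", ["A", "B", "C"]),
    ("dusk", ["A", "C", "A", "C"]),
]
model = train(recordings)

print(predict(model, "dawn", "A"))  # what usually follows "A" at dawn
print(predict(model, "dusk", "A"))  # same call, different context
```

The same previous call gets a different prediction depending on context, which is about all a model like this can claim.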

joshvm 1 days ago [-]
Someone already mentioned Aza Raskin, but the organisation you should look up is Earth Species Project. It’s a fairly open question and fairly philosophical - do the semantics of language transcend species? Certainly there is evidence that “concepts” are somewhat language agnostic in LLM embedding spaces.

https://www.earthspecies.org/about-us#team

dleeftink 1 days ago [-]
Captivating watch from Aza Raskin on the subject:

https://youtu.be/3tUXbbbMhvk

pixelpoet 3 hours ago [-]
I had the pleasure of hanging out with him at Stochastic Labs in 2018 while he was working on this, and I was working on 3D fractal stuff there. Pretty fun place, and was my first time living in the US.

At the time it seemed a bit wild / long shot, but now he just looks like a pioneer.

benlivengood 24 hours ago [-]
Presumably anyone with a multimodal transformer already pretrained on human data could further pretrain it on animal vocalizations. I don't know whether any of the large model owners are doing this.
supriyo-biswas 1 days ago [-]
dghughes 1 days ago [-]
Quite a few geese are flying over me each day now. I've convinced myself they are saying to each other "left..left..OK straight...right a bit...OK". I'm amazed at how precise they can be (and sometimes not), like when they all stop flapping at once, glide, then flap again. There were at least 24 to 40 geese all acting in perfect harmony.
lubujackson 1 days ago [-]
I remember seeing a video from the 80s about how the behavior is emergent - they made a computer program that replicated how birds fly by stating just a few axioms like don't fall behind and don't be in front.

The idea being that the V takes shape because they want to have a bird in front of them the entire time while one poor bird gets stuck out in front.
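
The program described sounds similar to Craig Reynolds' "boids" model (1987), though that is an assumption, since the video isn't identified. Below is a minimal sketch of the three usual boids rules (separation, alignment, cohesion), where each bird only looks at nearby neighbours. The weights, radius, and dict layout are invented for illustration, and this sketch produces loose flocking, not a V formation.

```python
import random


def step(flock, radius=10.0, w_sep=0.05, w_align=0.05, w_coh=0.01,
         min_dist=1.0, dt=1.0):
    """Advance every bird one tick using only information about nearby birds."""
    new_flock = []
    for i, b in enumerate(flock):
        near = [o for j, o in enumerate(flock)
                if j != i
                and (o["x"] - b["x"]) ** 2 + (o["y"] - b["y"]) ** 2 <= radius ** 2]
        vx, vy = b["vx"], b["vy"]
        if near:
            n = len(near)
            # Alignment: steer toward the neighbours' average velocity.
            vx += w_align * (sum(o["vx"] for o in near) / n - b["vx"])
            vy += w_align * (sum(o["vy"] for o in near) / n - b["vy"])
            # Cohesion: steer toward the neighbours' average position.
            vx += w_coh * (sum(o["x"] for o in near) / n - b["x"])
            vy += w_coh * (sum(o["y"] for o in near) / n - b["y"])
            # Separation: push away from neighbours that are too close.
            for o in near:
                dx, dy = b["x"] - o["x"], b["y"] - o["y"]
                if dx * dx + dy * dy < min_dist ** 2:
                    vx += w_sep * dx
                    vy += w_sep * dy
        # Build a new list so every bird reacts to the same previous state.
        new_flock.append({"x": b["x"] + vx * dt, "y": b["y"] + vy * dt,
                          "vx": vx, "vy": vy})
    return new_flock


def velocity_spread(flock):
    """Mean squared deviation of velocities from the flock average."""
    n = len(flock)
    mvx = sum(b["vx"] for b in flock) / n
    mvy = sum(b["vy"] for b in flock) / n
    return sum((b["vx"] - mvx) ** 2 + (b["vy"] - mvy) ** 2 for b in flock) / n


rng = random.Random(0)
flock = [{"x": rng.uniform(0, 5), "y": rng.uniform(0, 5),
          "vx": rng.uniform(-1, 1), "vy": rng.uniform(-1, 1)}
         for _ in range(10)]
```

Calling `step` repeatedly and watching `velocity_spread` is one way to see the birds fall into a shared heading without any bird being told where the flock is going.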

grose 1 days ago [-]
Suppafly 1 days ago [-]
>The idea being that the V takes shape because they want to have a bird in front of them the entire time while one poor bird gets stuck out in front.

I imagine they take turns like bicyclists do, right?

WaitWaitWha 1 days ago [-]
tiagod 20 hours ago [-]
These are happening this time of the year where I live. I like to go out at sunset to watch them dance. It's amazing how they coordinate so well at such close quarters, looks like a single organism from afar.
TomK32 1 days ago [-]
You think it needs a lot of coordination to fly in sync? I only have a slow clap for you. Actually, everybody else join in the clapping, and without any coordination whatsoever you'll notice that we clap in sync after just 50 seconds: https://www.youtube.com/watch?v=Au5tGPPcPus
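
One common toy model for this kind of synchronisation is the Kuramoto model of coupled oscillators. The sketch below assumes each clapper has a slightly different natural tempo and nudges their phase toward the crowd's average; the coupling strength, frequencies, and crowd size are made up, and nothing here claims a real audience behaves this way.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Return r in [0, 1]: near 0 means scattered claps, 1 means perfect unison."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def simulate(phases, freqs, coupling, dt=0.05, steps=2000):
    """Euler-integrate the mean-field Kuramoto equations and return final phases."""
    phases = list(phases)
    n = len(phases)
    for _ in range(steps):
        # The "sound of the crowd": average phase psi and its coherence r.
        z = sum(cmath.exp(1j * p) for p in phases) / n
        r, psi = abs(z), cmath.phase(z)
        # Each clapper keeps their own tempo w but is pulled toward psi.
        phases = [p + dt * (w + coupling * r * math.sin(psi - p))
                  for p, w in zip(phases, freqs)]
    return phases


rng = random.Random(1)
n = 30
start = [rng.uniform(0, 2 * math.pi) for _ in range(n)]
freqs = [rng.uniform(5.9, 6.1) for _ in range(n)]  # rad/s, about one clap per second
synced = simulate(start, freqs, coupling=2.0)

print(round(order_parameter(start), 2), round(order_parameter(synced), 2))
```

No clapper is designated as leader here; the only input each one gets is the aggregate. Whether a few people seed it in practice is a separate question the model doesn't settle.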
bombela 1 days ago [-]
This slow clap thing is a tradition to ask for an encore/bis/repeat at concerts. So I wouldn't be so quick to state that this is an emergent phenomenon.

But maybe this has become the tradition because when you clap for a long time it would slowly synchronize.

In the video it is quite clear a few people are seeding the synchronisation.

dghughes 1 days ago [-]
Also the changing of the point goose: the one in the lead falls to the back, and one of the two behind it in the V takes over.

And they never shut up, plus they are so loud. They're talking about something.

calebm 1 days ago [-]
So this is purely anecdotal, but it seems to me that bird songs work kind of like drum circles. A bird can sing a pattern, and see if anyone else can replay the pattern. If you can, then the initiating bird will slightly modify the pattern, and see if you are able to pick up on the nuance. With drum circles, people typically play off of patterns set by others. And both the leader and follower can tell that they are in sync with each other. I suspect that this dynamic is at the core of a lot of bird song interactions. And to try to translate that into a human language would not work well.
BurningFrog 1 days ago [-]
What you're describing is a recreational activity, that doesn't serve any practical purpose. Evolution rarely favors that.
calebm 1 days ago [-]
No... it is more than recreational. It is a way to establish shared understanding (which could be a test of how similar you are, and intelligence).
jesprenj 1 days ago [-]
It would be pretty useful if we could somehow convince birds to relay our messages using birdsongs -- just use a speaker to transmit a message, encoded in birdsong with some special preamble header, and it will get broadcast or unicast to the desired destination bird that happens to be located near a microphone that receives this message. Could this scheme beat IPoAC? Maybe if we manage to reverse engineer birdsongs well enough, BGP could be ported to birds!
ddtaylor 1 days ago [-]
To Mock a Mockingbird was a book that was sent to me and I enjoyed it. I can't fully do all of the puzzles, but they are for sure fun.

https://www.amazon.com/Mock-Mockingbird-Raymond-Smullyan/dp/...

ambientenv 19 hours ago [-]
What if we don't spend the effort, time, and money to 'decode' birdsong? What if we don't feel the need to uncover such, to reinforce the human exceptionalism? What if it really wasn't meant for us to know? What if we simply just relaxed and immersed ourselves in the beauty of birdsong? Would we be somehow deprived?
michaelmior 1 days ago [-]
If anyone has a spare Raspberry Pi and is looking for a fun project, consider BirdNET-PI[0]. It turns your Raspberry Pi into a 24/7 bird monitoring device. You need a microphone and then it will automatically detect birds by their songs and report them to the BirdWeather service that helps monitor bird populations.

[0] https://www.birdweather.com/birdnetpi

zcw100 1 days ago [-]
Anyone interested in this subject might find this art project interesting as well "Deep Fake Birdsong 2020" https://www.kellyheatonstudio.com/deep-fake-birdsong
more_corn 1 days ago [-]
Didn’t Douglas Adams have a bit about this? Once you figure it out you’d do anything to return to blissful ignorance. It’s all inane chatter about what’s for dinner, who’s looking hot today, and more than anyone would ever want to know about wind speed and weather conditions.
af3d 1 days ago [-]
I always fancied that they might be debating philosophical points or maybe even offering up "tweets" of wisdom. Owl: "In order to understand the very nature of the mind itself, one must earnestly seek to find the answer to this riddle: WHOoooooooo?!"
havaloc 1 days ago [-]
"He learned to communicate with birds and discovered their conversation was fantastically boring. It was all to do with windspeed, wingspans, power-to-weight ratios and a fair bit about berries."
johnaspden 22 hours ago [-]
Fancy a fuck? Fancy a fight? My tree! My tree!
11235813213455 1 days ago [-]
are they "singing" or aren't they simply talking/communicating?
lakomen 8 hours ago [-]
That's something I was thinking about, but don't have the brain juice to do. Every bird species has its own language. I'd say tempo, frequency, pause duration, it all plays a role.

Do dogs have a language? Woof woof woof sounds a lot like the crows' craaaw craaaw, and while they don't speak they do communicate. Maybe there are nuances that we just can't hear or recognize.

It's certainly fascinating.

How do you map sound to action? You can't analyze it independently; you also need to take the environment into account.

ChrisMarshallNY 1 days ago [-]
Reminds me of this classic bit (Archive, because they have a paywall, now): https://web.archive.org/web/20160718151008/http://www.thedai...
croisillon 1 days ago [-]
reminds me of the "i wish i could talk to ponies" comic
styczen 23 hours ago [-]
simple:

give me a worm or money

joelignaatius 21 hours ago [-]
So there was this German scientist who went about decoding how bees communicate where pollen sources are. I believe he won a Nobel Prize for it. He had to individually hand paint bees. I can't remember the details and I'm too lazy to look it up.

The point here is that if you want to know what birds are saying you'd probably have to record the flapping of wings (especially with the more colorful birds) and then the bird song - their eyesight is particularly acute due to needing to eat fresh berries so body posture is most likely important in communication. A high def camera and microphone and an LLM should be able to do the job if the data is good enough on a particular species. From there you should be able to extrapolate to multiple species.

The language would probably be along the lines of a few set phrases

- wanna mate?
- where's food? Here's food!
- stay away
- fly together south?

Stuff like that.

Dilettante_ 1 days ago [-]
[flagged]