spanish_experiment

This is the repository for my experiment in which I will attempt to learn Spanish solely by consuming media in the language, without using dictionaries, studying grammar, or receiving any form of instruction.

This will be my second attempt. During my first attempt I watched 30.5 hours of TV shows in Spanish, between the 11th and the 15th of February 2021. Daily logs, taken in video format, can be found in the repository.

The other files in the repository document my second attempt.

Information about the experimental subject (me):

Native language: Hungarian
Second languages: English, Japanese (in addition, I had German classes for 4 years, but I gained practically no competence, although I remember a bit of the grammar and a handful of words)
Age at the start of the second attempt: 21
Location during both attempts: Hungary

About the second attempt:

Start: 2021-06-28
Goal: 1000 hours spent on Spanish (Including the 30.5 from the first attempt)

I will count the hours I spend with Spanish organized into the following categories:

Audio-only
Audiovisual
Subtitled (Spanish subtitled audiovisual)
Text-only
Text with visuals (for example comic books and movies in an unrelated language with Spanish subtitles)
Speaking
Unspecified
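The bookkeeping described above can be sketched in a few lines of Python. This is only a minimal illustration, not the script I actually use: the names `GOAL_HOURS`, `FIRST_ATTEMPT_HOURS`, `CATEGORIES` and `progress` are made up for this example, and it assumes the hours per category have already been added up by hand or from the time data.

```python
from typing import Dict, Tuple

GOAL_HOURS = 1000.0          # goal stated above
FIRST_ATTEMPT_HOURS = 30.5   # hours from the first attempt, counted toward the goal

# Category names as listed above (the spelling used as keys is an assumption).
CATEGORIES = (
    "Audio-only",
    "Audiovisual",
    "Subtitled",
    "Text-only",
    "Text with visuals",
    "Speaking",
    "Unspecified",
)


def progress(hours_by_category: Dict[str, float]) -> Tuple[float, float]:
    """Return (total hours so far, hours remaining until the goal)."""
    unknown = set(hours_by_category) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"Unknown categories: {sorted(unknown)}")
    total = FIRST_ATTEMPT_HOURS + sum(hours_by_category.values())
    remaining = max(GOAL_HOURS - total, 0.0)
    return total, remaining
```

For example, `progress({"Audiovisual": 100.0, "Text-only": 19.5})` gives a total of 150.0 hours with 850.0 hours remaining.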

I will occasionally make journal entries, and I will rate at least one piece of content every week on the following scale:

0 I don't understand anything
5 I recognize a word here and there, but I can't understand any sentence other than maybe the rare 2-3 word one every now and then
10 I regularly recognize words and, rarely, I can even understand simple sentences that are a few words long
15 I am familiar with most words that I encounter, but I can only understand short and simple sentences regularly, and I only occasionally understand more complex sentences
20 I understand a good chunk of the sentences I encounter, but I also don't understand a good chunk of them. Understanding a sentence is 'not the norm'. Most of my ability to follow the text still comes from other means (e.g. guessing from the visuals)
25 I understand the majority of sentences I see, but it feels like I still don't understand the important ones. I can follow the text to some extent, but I lose track very often and regularly
30 I understand most sentences, and understanding them is 'the norm', and I can follow a good chunk of what is being talked about, but I still lose track often, and sentences I don't understand still show up often
35 I can follow most of what is being talked about, but I lose track occasionally, and sentences I don't understand show up regularly, though most of the time they aren't too much of a problem because of context. A lot of the nuance is lost on me (jokes, cultural and implicit stuff)
40 I can follow pretty much all of what is talked about and understand almost all sentences with occasional exceptions, but some of the nuance is still lost on me. I still encounter a lot of words I don't know
45 I understand virtually all sentences and never lose track, but occasional unknown words still show up, and a bit of the nuance is still lost, not quite putting me at the level of a native speaker
50 My understanding is on the level of a native speaker

When giving a rating, I will also write down what type of content it is, its original language, and whether I have not seen it before (New), have seen it before in Spanish (Rewatch), or have seen it before in a language I understand (Familiar).

I plan on posting weekly updates on Reddit.

About the files (not important to read)

The Journal folder contains the weekly journal files, named journal_week[number].txt. Entries are delimited by lines containing only a ; and empty entries are denoted by a !
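If you want to read the journal files programmatically, something like the following Python sketch should work. It is only an illustration under my reading of the format described above: it assumes an entry is all the text between two delimiter lines and that an empty entry consists of a single ! on its own. The function names `parse_journal` and `_finish` are made up for this example.

```python
from typing import List, Optional


def _finish(lines: List[str]) -> Optional[str]:
    """Join the collected lines; an entry that is just '!' counts as empty."""
    body = "\n".join(lines).strip()
    return None if body == "!" else body


def parse_journal(text: str) -> List[Optional[str]]:
    """Split the contents of a journal file into entries.

    Entries end at a line containing only ';'. Empty entries ('!') are
    returned as None so that positions in the list are preserved.
    """
    entries: List[Optional[str]] = []
    current: List[str] = []
    for line in text.splitlines():
        if line.strip() == ";":
            entries.append(_finish(current))
            current = []
        else:
            current.append(line)
    # Keep any trailing text that is not followed by a final delimiter.
    if any(line.strip() for line in current):
        entries.append(_finish(current))
    return entries
```

For a single file this could be used as `parse_journal(open("journal_week1.txt", encoding="utf-8").read())`, where the file name follows the naming pattern above and the encoding is a guess.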

The Time Data folder contains Excel files that record when I started and ended each of my sessions. They are as I exported them from Clockify, except that I removed some unnecessary columns. I don't think you can open these straight from GitHub; you have to download them.

The others are pretty self-explanatory I think.

Motivation (not important to read)

First let me say that I'm not trying to conclude anything about Krashen's theories, for two main reasons:

The theory is not defined well enough. It's not even clear whether all of the utterances in the input need to be comprehensible for it to count as comprehensible input (CI), or whether every comprehensible utterance counts as CI on its own. Depending on how you interpret that, you could twist my experiment to be about either CI or incomprehensible input.
It's well known that CI methods like TPRS work and outperform more traditional methods like grammar translation and the audio-lingual method in the first 100ish hours (see chapter 6 of this), so I will avoid content that is specifically made to be comprehensible for learners. What linguists are unsure about is whether or not this would hold for the entire acquisition process.

What I am trying to do is determine how 'pure' we can get with immersion methods and still get acceptable results.

Acceptable results means reaching B2/C1 in the same time frame as people do at the FSI, which is 1000 hours for Spanish, i.e. when we can't tell the difference between the two results using qualitative descriptions. If we knew how low the 'bar' for acceptable results was, then other methods of immersing with media that are above the 'bar' would also produce acceptable results, meaning that you could choose between them based on other things, like enjoyment. For example, consider the following question:

"I spend 90% of my time immersing and 10% with a textbook, and I look up a word about every 10 minutes. Some said it would be faster if I spent that 10% with Anki instead of a textbook and doubled my number of lookups. What should I do?"

Answer: "You have no way of knowing which one would be faster, because we can't even tell the difference between more extreme methods, but a guy did it with no lookups whatsoever (assuming my experiment goes well), so both options are guaranteed to get you acceptable results. So you should just do the one that's easier for you."

There is only [one long-term study](https://espace.library.uq.edu.au/view/UQ:9b49365) that tested learning from media alone. The author tried learning French by watching TV shows that were at first randomly selected from a list of TV shows on Wikipedia. I don't think he outright said whether or not he abstained from familiar shows, but after this random selection phase he watched things like Shrek, so I'm not sure how much familiar content he watched, if any. He completely avoided reading, to the point that he turned his head away when signs and the like appeared on the screen. He sat a B1 exam after 1300 hours, and his combined score was a few points short of the passing score. (The fact that he got his best score on the reading section, despite never having read French, should give you an idea of the credibility of that test.)

He did learn thousands of words, so he successfully disproved the myth that 'You can't learn a language just by watching TV unless you are younger than 7-10', but his results are still clearly not acceptable.

I think that with the addition of reading and choosing content in a sensible way, there is a realistic chance that I will get acceptable results and a realistic chance that I won't, so I believe that this is the optimal place to start searching for the 'bar'.

If I don't get acceptable results I will try again with minimal lookups, though I probably won't have time for it any time soon.
