foundations-of-information's Introduction

For several years, I've taught an introductory course on the intellectual foundations of information at the University of Washington. I've written this book as a gateway into the topics of the course, providing a broad overview of major topics in information, and links into the deeper research and popular literature. You can find the published form here:

https://faculty.washington.edu/ajko/books/foundations-of-information

foundations-of-information's People

Contributors

amyjko, antoinecheron, btosic, luisschubert, lukechannings, mathisonian, uppajung

foundations-of-information's Issues

Add right to be forgotten citation

Korenhof, P. & Koops, B-K. (2014). Identity construction and the right to be forgotten: The case of gender identity. In A. Ghezzi, A. G. Pereira, & L. Vesnić-Alujević (Eds.), The ethics of memory in a digital age: Interrogating the right to be forgotten (pp. 102-121). Palgrave Macmillan

Incorrect definition of Shannon information / Kolmogorov complexity in Ch3

The following quoted section, the way it's currently written, is confusing entropy with Kolmogorov complexity. https://en.wikipedia.org/wiki/Kolmogorov_complexity

"For example, the sequence of letters aaaaa has low entropy, conceptually, as it is the same letter appearing multiple times; it could be abbreviated to just 5 a’s. In contrast, the sequence of letters bkawe has high entropy, with no apparent pattern, and no apparent way to abbreviate it without losing its content. Shannon’s view of information was thus as an amount of information, measured by the compressibility of some data.

Another way to think about Shannon’s entropic idea of information is through probability: if we were to observe each character in the sequences above, and make a prediction about the likelihood of the next character, the first sequence would result in increasingly high confidence of seeing another a. In contrast, in the second sequence, the probability of seeing any particular letter is quite low. The implication of these ideas is that the more rare “events” or “observations” in some phenomenon, the more information that is required to represent it."

Specifically, you say the sequence "could be abbreviated to just 5 a's." That is most definitely Kolmogorov complexity, which perhaps would merit inclusion in the book too. The example must rely on Kolmogorov complexity because you make no reference to a random variable that describes how those sequences are generated. Without a prior distribution, you must appeal to Kolmogorov complexity. With a prior distribution, the entropy is well defined and thus provides a bound on compressibility.
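To make the compressibility reading concrete, here is a minimal Python sketch. It uses the size of zlib's output as a rough, computable stand-in for description length; Kolmogorov complexity itself is uncomputable, so a compressor only hints at an upper bound. The helper name `compressed_size`, the string length of 1000, and the random seed are assumptions chosen for illustration, not anything taken from the book.

```python
import random
import string
import zlib


def compressed_size(text: str) -> int:
    """Return the number of bytes zlib needs to store `text` (a crude proxy
    for description length, not true Kolmogorov complexity)."""
    return len(zlib.compress(text.encode("ascii"), 9))


# Illustrative inputs (assumed for this sketch): longer versions of the
# book's "aaaaa" and "bkawe" so the compressor's fixed overhead matters less.
rng = random.Random(0)
repetitive = "a" * 1000
scrambled = "".join(rng.choice(string.ascii_lowercase) for _ in range(1000))

print("repetitive:", compressed_size(repetitive), "bytes")
print("scrambled: ", compressed_size(scrambled), "bytes")
```

The repetitive string compresses to far fewer bytes than the scrambled one, and the comparison never mentions a probability distribution. That is the sense in which the "5 a's" abbreviation is a statement about description length rather than Shannon entropy.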

Also, I don't believe the 2nd paragraph I quoted above continues in a correct way either. To follow the reasoning as it's currently written, you need at least some modeled hyperparameter governing the generation of the next letter in the sequence, with a distribution over the values of the hyperparameter itself. If you don't want to complicate the example, one might think we could adjust the 2nd paragraph to say that, if a letter sequence was generated by drawing letters uniformly at random, the sequence 'aaaaa' would have low entropy and 'bkawe' would have high entropy since it seems random. But that is not true! They have the same entropy!
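A small Python sketch of that last point, assuming a source that draws each of the 26 lowercase letters independently and uniformly (the function names `surprisal_bits` and `entropy_bits` are hypothetical helpers for this illustration):

```python
import math
import string


def surprisal_bits(sequence: str, letter_probs: dict) -> float:
    """Self-information of a specific sequence, in bits, when each letter is
    drawn independently with the probabilities in `letter_probs`."""
    return -sum(math.log2(letter_probs[ch]) for ch in sequence)


def entropy_bits(letter_probs: dict) -> float:
    """Shannon entropy of the source, in bits per letter."""
    return -sum(p * math.log2(p) for p in letter_probs.values() if p > 0)


# Assumed source: 26 equally likely lowercase letters.
uniform = {ch: 1 / 26 for ch in string.ascii_lowercase}

print(surprisal_bits("aaaaa", uniform))  # same value as the next line
print(surprisal_bits("bkawe", uniform))
print(entropy_bits(uniform))             # entropy belongs to the source
```

Under this assumed source, both five-letter sequences carry the same self-information (about 23.5 bits), because entropy is a property of the distribution, not of how patterned a particular outcome looks.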

Perhaps better would be to use an example where you toss a coin x times, and count how many heads and tails. (like this example: https://courses.lumenlearning.com/physics/chapter/15-7-statistical-interpretation-of-entropy-and-the-second-law-of-thermodynamics-the-underlying-explanation/) Then a set of tosses that is all heads would be very low entropy, and a set of tosses with about equal heads and tails would be high entropy. The key distinction is whether you view it as a sequence or a set. I think H/T are easier to think of as a set.

"Elements of Information Theory" by Cover and Thomas is a book I like on Shannon information.

Anyhow, thanks for putting this book up. I've been quite enjoying it so far, just thought I could help on this little detail.
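The set-versus-sequence distinction can be sketched by counting how many distinct toss sequences share the same number of heads. This is only an illustration in the spirit of the linked physics example; `macrostate_bits` is a hypothetical helper name, and taking the base-2 logarithm of the count is an assumed convention for expressing the result in bits.

```python
import math


def macrostate_bits(n_tosses: int, n_heads: int) -> float:
    """Log2 of the number of distinct toss sequences that contain exactly
    `n_heads` heads out of `n_tosses` fair-coin tosses."""
    return math.log2(math.comb(n_tosses, n_heads))


for heads in (10, 8, 5):
    print(heads, "heads out of 10:", round(macrostate_bits(10, heads), 2), "bits")
```

With 10 tosses, "all heads" corresponds to exactly one sequence (0 bits), while "5 heads" corresponds to 252 sequences (about 7.98 bits). Viewed as individual sequences of fair tosses, every outcome is equally likely; the difference only appears once outcomes are grouped by their head count.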

preference for This / That / These / It ... throughout text

I have noticed a preference for this/these/they/it in your writing -- requiring the reader to keep track of focal subjects/ideas/objects from a clause, sentence, or paragraph earlier.

No judgement on your stylistic choice; rather, flagging it as something that could prove problematic for reader comprehension. Example from managing:

He went on to frame the problem of attention allocation as an economic and management one, noting that many at the time had incorrectly framed organizational problems as one of a scarcity of information, rather than one of a scarcity of attention. Instead, he argued that in contexts of information abundance, the key problem is figuring what information exists, who needs to know it and when, and archiving it in ways that it can be accessed by those people when necessary.

Starting with "he", I look back to the previous paragraph or have Simon still in mind, very good.
(slight distraction of problem / problems - one, numerical inconsistency, moving on)

These ideas, along with Simon’s book Administrative Behavior, laid the foundation for problems of information management for the coming decades. They were taken up to shape perspectives on management in business. They were used to explain problems of advertising, in which consumer attention was the scarce resource. And they became the foundation of even personal information management problems, such e-mail spam and growing archives of personal photos.

As reader, I go back to previous paragraph to make sure that I am tracking the correct 'These'.
Pause throughout reading to make sure that I have the right 'they' -- first "they were taken up..." does not refer back to the immediately-previous plural "decades", or even "problems", but the very first "ideas." ...

"They were used" refers not to "perspectives" but ... probably back to "ideas".

By the time I encounter "...they became", I assume that you are reaching back to the original "These ideas", a callback to the previous paragraph - which takes a second to connect.

(slight distraction: "such e-mail spam" -- such as, perhaps?)

Suggested "Laws as information" Supplement

Related to the "knowing the laws" paragraph, the state of Georgia had its legal code behind a massive paywall for a hot minute. (SCOTUS said stop that with some dissenting opinions that had troubling ramifications if the vote happened to swing the other way--copyright law is just the worst). Might be too much of a tangent but it's a nice contrast to the Washington State Legislature.

revisit for clarity? [managing]

With these two kinds of organization in mind, the critical difference personal and organization information management is what information is for.

I think you're missing a 'between', and might revisit structure on this sentence, too -- didn't propose edits because I can imagine this going many different ways.

Typo in Chapter 12

I think there is a typo in the following sentence. I have bolded the issue.

"many researchers are investing new forms of decentralized moderation, such as online harassment moderation systems that use friends instead of platform maintainers or moderators."

I'm enjoying the book! Thank you for writing it

Possible Typo in Chapter 15

I am unsure how the bolded word in the following sentence fits in, which makes me suspect it is a typo for silo. (It is possible that I am just unfamiliar with the term "healthcare solos".)

"secure medical data, creating issues of poor data interoperability between health care solos,"

Chapter 18 Typos

Missing either a conjunction or a third item to enumerate
"(e.g., by deprioritizing accessibility, locking out people with disabilities from key information resources)."

License Clash?

I notice you are using CC0 on the repository, a quit-claim to sort-of force the work into the public domain. For those who want to rely on copies/adaptations of the material, some might want to establish the provenance of such work. I wonder how that would best be done. Do you have a suggested way?

My original reason for this issue though, is to note that the CC0 quit claim is different than the CC-By-No-Derivatives that appears at https://faculty.washington.edu/ajko/books/foundations-of-information/#/

Inconsistency/typo in first paragraph

The intro begins

My great grandfather was a saddlemaker

and then later says

His children—my great grandfather and his siblings

Am I misunderstanding the relationships here or is this a typo? Either the first family member should be "great^2" or the second "great^0".

Incoherent passage, Chapter 1

Third to last paragraph: "But the power of information, therefore, derives more from its meaning, its context, and how it is received and used, and less from how it is transmitted or received." "Received" is used here for opposite cases, and "received and used" sounds almost the same as "transmitted or received." You seem to be talking about the difference between meaning/function and technology, but it's not at all clear what these terms designate and whether you are simply making the same point as below (challenging "the medium is the message").

Address epistemology

Epistemology is central to defining knowledge, but the knowledge chapter only briefly mentions it and it doesn't come up anywhere else in the book. Think about how to engage it and where.

minor edits: "it's" ch 1 + 3

Both bolded below (emphasis mine) should be sans apostrophe

ch1

Throughout, we shall see that while information is powerful in it’s capacity to shape action, and information technology can make it even more powerful, information can also be perilous to capture and share without doing great harm and injustice.

ch3

From a process perspective, the door itself is not information, but particular people in particular situations may glean different information from the door and it’s relation to other social context about its meaning.

Caption Typo in Chapter 5

In chapter 5 there is a picture with the caption "Google’s most recent quantum computer can perform a task that currently takes 10,000 in a few minutes. Credit: Google."

The units for 10,000 are missing.

"was replaced speed"

In the great prologue of the information (a.k.a. knowledge?) chapter, I get what you mean, based on the following passages. I stumbled on "the wonderful anticipation that came with having to wait for information, was replaced speed" though. Perhaps displaced/supplanted/forfeited by immediacy?

Minor typo in first chapter

In the section "The illusory power of technology", cause is misspelled (it's spelled as cuase). Just noticed it while I was reading, great book!

Integrate ideas of Bowker and Star

Bowker, G. C., & Star, S. L. (2000). Sorting things out: Classification and its consequences.

At a minimum, cite it, but probably also address it early on in relation to knowledge organization.

Interoperability

In Chapter 15 you mention data interoperability. I think it's an interesting topic that is worth a definition (i.e., you define many other terms that are more common, like capitalism). I would also argue that a small aside on data interoperability is worthwhile, perhaps in the context of how many of our current information systems came to be by leveraging interoperability and then slamming the door behind them with prickly EULAs nominally backed by the Computer Fraud and Abuse Act. Cory Doctorow has written a bit about adversarial interoperability recently, which may prove to be a good citation if you want to add this stuff to your book.

Link to the published form

I lost the URL to the published form, although GitHub renders the chapter pages pretty nicely. Maybe add it to the README.md?

Typo in the first chapter

Hi!

In the 1st chapter of the book, in the section describing how information teaches us, there is a simple typo: in the sentence "[...]through application programming interfaces that facilitate time and date arithemtic", the last word should be "arithmetic", I guess.

Thank you for the book!

Employ GitHub Discussions too

I recommend that you enable the new Discussions feature on this project repository.

That might be too meta, so one might want to introduce Discussions with a statement of the desired domain of discussion.

Information management chapter is vague

There was some vague feedback from students that the information management chapter was more confusing and abstract. Try to make it more interesting and concrete?

minor edit: capitalize U.S. constitution (ch4)

constitution -> Constitution

This census, mandated by the U.S. constitution, is a necessary part of determining many functions of government, including how seats in the U.S. House of Representatives are allocated to each state, based on population, as well as how federal support of is allowed for safety net programs.

Date incorrect in Chapter 1 The power of information

Aloha,

You write, "In February and March 2019, an international community of American and Chinese researchers worked together to model, describe, and share the structure of the SARS-CoV-2 spike." which I think ought to be February and March 2020. SARS-CoV-2 hadn't yet been identified in early 2019.

Thank you for writing this entire "book."

missing word in Chp 16

"Therefore, even in the U.S., which is regarded as having some of the strongest speech protections, has limits, and political speech is included in these limits."

Likely intended to be "speech has limits".

Chapter 17 Typos

Missing comma between "British Columbia" and "California" in paragraph 2:

"... the United Kingdom, Alberta, British Columbia California, Missouri ... "

and "events" is misspelled as "evens" in paragraph 3 of the "Managing Crisis Events" section:

"Throughout such evens, particular individuals ..."

Chapter 16 Typos

Missing a space

"no organized role in society, since there is no society.Such a system maximizes"

Consider changing how you mention Pres #45
"on January 8th, 2021, Twitter permanently suspended the account of President Trump after he incited mob violence on the U.S. capital."

According to this source former President [last name] or Mr. [last name] are both acceptable. I've seen venues do both. Obviously, this is a stylistic change and kinda nitpicky but I figured I would bring it up anyway.

suggestion: seminal -> foundational (or etc.)

Perhaps consider "foundational" (or formative, important, key, vital, central, earth-moving, game-changing, ground-breaking [or other noun-gerund combinations] ... ) instead of seminal?

Four appearances in text:
ch 3

In his seminal work, he linked information to the concept of entropy from thermodynamics.

ch 4

This history of analog encodings directly informed the digital encodings that followed Shannon’s seminal work on information theory.

ch 9

What determines when information management becomes necessary? The answer to this question goes back to some seminal work from Herb Simon, who said in (yet another) seminal work, Designing Organizations for an Information-rich World
