Data intro: Some notes and links

The link from class today: http://bit.ly/toolkitweek8 (this was updated Wednesday)

Your mission this week is to research and write the assigned blog post, and work through Yau’s example exercise, pp. 27–38. Use the airport code and year assigned to you (see the link from class).

If you modify and run the Python script successfully, you will have a text file named wunder-data.txt — it will be a plain-text file with exactly 365 lines, as I showed you in class. Bring that file and your MacBook to class next week, and we will use the file in Excel to produce a data graphic showing a year’s worth of high temperatures for your city.
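If you want to check the line count yourself before class, a short sketch like the one below is one way to do it. The helper name count_lines is made up for this example, and the sketch assumes wunder-data.txt sits in the folder where you run it.

```python
import os

def count_lines(path):
    """Return the number of lines in a plain-text file."""
    with open(path) as f:
        return sum(1 for _ in f)

filename = "wunder-data.txt"  # the file name used in this post

# Only count if the file is actually in the current folder.
if os.path.exists(filename):
    print(count_lines(filename))
else:
    print("No file named " + filename + " in this folder yet.")
```

If the number printed is not what you expect, the script probably stopped early or ran more than once.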

Now, there are two things in Yau’s example that might cause you a little confusion.

Arrays

The first is his example with images, first_image, and image[0]. There’s a common object in programming called an array (Python calls it a list). What the BeautifulSoup library does, behind the scenes, is stuff one entire Web page into an ordered collection of all the coded parts of the page. Then you can call out an array of all images (img), or all spans (span), or all links (a), which are HTML tags. Once you have called out that array, you can ask for the first image, image[0], or the second image, image[1], and so on. BeautifulSoup handles most of this for you.

If I had a small array of girls’ names, it might look like this:

girlnames = ["Ann", "Christina", "Elizabeth", "Maria"]

If I needed to write a script to get the fourth name in the array called girlnames, I would use this:

girlnames[3]

That would bring me the name Maria. Python, like most programming languages, starts counting at 0 (not at 1), so the fourth item is number 3.
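Here is a small sketch you can type into Python to see the counting for yourself. It uses the same four names; the variable names first_name and fourth_name are just for illustration.

```python
# A list of four names; their positions are numbered 0, 1, 2, 3.
girlnames = ["Ann", "Christina", "Elizabeth", "Maria"]

first_name = girlnames[0]    # position 0 is the first item
fourth_name = girlnames[3]   # position 3 is the fourth item

print(first_name)      # Ann
print(fourth_name)     # Maria
print(len(girlnames))  # 4 items, so the last position is 3
```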

In Yau’s example exercise, you’ll be using similar code to grab not a girl’s name from a list of names, but a number (a temperature) from a list of spans that have the class nobr.
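To see the same idea without installing anything, here is a rough sketch that uses Python 3’s built-in html.parser module as a stand-in for BeautifulSoup. It is not Yau’s code. The HTML, the numbers in it, and the names SAMPLE_HTML, NobrSpanCollector, and nobr_spans are all made up for illustration.

```python
from html.parser import HTMLParser

# Made-up HTML for illustration; a real weather page is much longer.
SAMPLE_HTML = """
<div>
  <span class="nobr">Mean</span>
  <span class="other">ignore me</span>
  <span class="nobr">61</span>
  <span class="nobr">74</span>
</div>
"""

class NobrSpanCollector(HTMLParser):
    """Collects the text inside every <span class="nobr">, in page order."""

    def __init__(self):
        super().__init__()
        self.inside_nobr = False
        self.spans = []

    def handle_starttag(self, tag, attrs):
        # Start a new list entry each time a nobr span opens.
        if tag == "span" and dict(attrs).get("class") == "nobr":
            self.inside_nobr = True
            self.spans.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self.inside_nobr = False

    def handle_data(self, data):
        # Keep text only while we are inside a nobr span.
        if self.inside_nobr:
            self.spans[-1] += data

collector = NobrSpanCollector()
collector.feed(SAMPLE_HTML)
nobr_spans = collector.spans   # an ordered list, like girlnames

print(nobr_spans[0])   # Mean
print(nobr_spans[1])   # 61
```

The point is the last two lines: once the spans are in a list, you pick one out by its position number, exactly as with girlnames[3].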

Where You Are When You Use Terminal

Another thing that might be confusing is how to find a file you created while using Python.

By default, you’re at the top of your home directory when you are using Terminal (see an illustration).

If you stay right there and don’t change anything, when you run your Python script and generate a new file, the file will also be there, at the top of your home directory. So to find that new file, just double-click the little house, and you’ll find the file.
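If you are ever unsure where that is, you can ask Python directly. This is a minimal sketch using the standard os module; the names current_folder and output_path are just for illustration, and nothing is written to disk.

```python
import os

# The folder Python is working in. If you started from a fresh
# Terminal window, this is normally your home directory.
current_folder = os.getcwd()

# Where a file opened by name only (no folder in front of it)
# would land. This line builds the path; it does not create the file.
output_path = os.path.join(current_folder, "wunder-data.txt")

print(current_folder)
print(output_path)
```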

Remember that Terminal and Python Are Not the Same

Python has commands that only work in Python — they do not work in Terminal alone.

In Terminal, you can use all kinds of normal Unix commands. One of these is ls (short for “list”). If you type ls and press Return, you’ll see a compact list of everything in the folder (directory) where you are. (See more about that.)
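The reverse is also true: ls will not work at the Python prompt. If you want a similar listing from inside Python, one option is os.listdir from the standard library, as in this small sketch. The variable name entries is just for illustration, and the order may differ slightly from what ls shows.

```python
import os

# Names of everything in the current folder, roughly like a bare `ls`.
entries = sorted(os.listdir("."))

for name in entries:
    print(name)
```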

Blog post 4: Introduction to data journalism

After reading the assigned chapters in Yau, your next task is to discover, by yourself, what people in journalism are saying about data skills and programming skills for journalists today. This is NOT about HTML, CSS or Web page design. The topic is data, data journalism, data-driven journalism, and programming. It includes Excel, computer languages such as Python, frameworks such as Ruby on Rails, databases and large data sets, etc.