Organizing Information

(posted: 01 Dec 2017)

I want you to learn econometrics, and the best way to learn econometrics is to do it. But more broadly, I hope that conducting an econometric analysis will teach you how to organize information.

In the specific case here, each column in your spreadsheet represents a variable and each row represents an observation, so your task is to properly align the information in that spreadsheet.
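To make the row-and-column layout concrete, here is a minimal sketch in Python. It is only an illustration, not part of the course workflow: the variable names and the numbers are invented, and the helper functions `load_observations` and `column` are hypothetical names chosen for this example.

```python
import csv
import io

# Hypothetical data, invented for illustration only.
# The header names the variables (columns); each later line is one observation (row).
RAW = """year,income,consumption
2014,100,80
2015,110,86
2016,120,93
"""

def load_observations(text):
    """Return a list of rows, each row a dict that maps variable name to value."""
    reader = csv.DictReader(io.StringIO(text))
    return [{name: float(value) for name, value in row.items()} for row in reader]

def column(observations, name):
    """Return one variable's value from every observation (one spreadsheet column)."""
    return [row[name] for row in observations]

observations = load_observations(RAW)
income = column(observations, "income")
```

If a value is entered in the wrong row or under the wrong header, every later calculation inherits the mistake, which is why the alignment matters.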

If you carefully construct that initial spreadsheet from reliable sources of data (and if you choose a good set of variables to test your null hypothesis), you should observe some clear trends in your data. Your task then is to explain those trends, test your null hypothesis and report your findings.

Gretl, of course, will help you run regressions and calculate statistics for your analysis. But Gretl is a tool. It is not the tool that is important. It is the quality of your input that is important.

That initial spreadsheet is what's important. How you organize information is what's important.

In a more general case, your information might not be the numeric data that we work with in econometrics. It might be names, addresses or whole documents and files. Your data might not even fit into a spreadsheet at all.

But some principles, like functions and variables, will remain the same. And, once again, what will be important is how you organize information.

For example, consider a different problem. Suppose you want to know what words are most commonly used to describe a product that you are selling or a stock in your portfolio.

Here, you must conduct a statistical analysis of words. To conduct such a statistical analysis of non-numeric data, what will be important is how you organize information.
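As a rough sketch of what the simplest such analysis might look like, the Python below counts how often each word appears in a piece of text. The sample reviews are invented, the function name `word_counts` is hypothetical, and splitting text on runs of letters is a simplifying assumption that ignores apostrophes, hyphens and other details a real analysis would have to handle.

```python
import re
from collections import Counter

def word_counts(text):
    """Count how often each lowercase word appears in the text."""
    # Assumption: a "word" is any unbroken run of letters (accented letters included).
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(words)

# Hypothetical product reviews, invented for illustration only.
reviews = "Great phone. Great battery, slow camera. The camera is slow but the phone is great."

counts = word_counts(reviews)
top_three = counts.most_common(3)  # the three most frequent words with their counts
```

Here each distinct word plays the role of a variable and its count is the observed value, so the result can be laid out in a table just like numeric data.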

Since the domain of our functions will be words (not numbers), we must define our words. Just as real numbers may be integers, rational numbers, irrational numbers, etc., our words may be nouns, verbs, adverbs, adjectives, etc.

That's why I am annotating the Sicilian language. Using that index, I can define most of the language in a very short amount of time. And with all of those definitions, we can conduct a statistical analysis (e.g. of Sicilian Wikipedia).

Which words are the most commonly used words? Which words are the most common objects of a particular verb or of a particular preposition? Which adjectives are most frequently used to describe a particular noun? Which adverbs ... ?
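One of these questions can be sketched in Python once each word has a definition. The example below is an assumption-laden illustration, not the method used for the Sicilian annotation: the tiny lexicon, the sample text and the function name `adjectives_before` are all invented, and it uses the crude rule that an adjective describes a noun only when it appears immediately before it (a rule that suits English word order, not every language).

```python
import re
from collections import Counter

# Hypothetical hand-made lexicon: each word is defined by its part of speech.
LEXICON = {
    "battery": "noun",
    "camera": "noun",
    "slow": "adjective",
    "weak": "adjective",
    "sharp": "adjective",
}

def adjectives_before(noun, text, lexicon):
    """Count the adjectives that appear immediately before `noun` in the text."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter()
    # Walk through adjacent word pairs: (previous word, current word).
    for previous, current in zip(words, words[1:]):
        if current == noun and lexicon.get(previous) == "adjective":
            counts[previous] += 1
    return counts

# Hypothetical text, invented for illustration only.
text = "A slow camera and a weak battery. Still a slow camera, not a sharp camera."

camera_adjectives = adjectives_before("camera", text, LEXICON)
```

The lexicon does the real work here: without a definition for each word, the program cannot tell an adjective from any other word, so the quality of the answer depends on how well the words were organized beforehand.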

How do the words used to describe a stock affect its price? How much do they affect its price?

We can find the answers to these questions if we organize our information.

