A photo of Vincent and Michelangelo

Proof that NoL bloggers are living beings. Anyone else have photos of NoL meet ups?

The Cost of ‘Free’ – or why I don’t like freeware

This is a partial response to Fabio Rojas’s recent post on the fate of Stata, a statistics package, given the rise of a free alternative, R. Rojas and others give many reasons why R is a good package, but for now I wish to deal with the argument that its being ‘free’ is a virtue.

R is free, but I see that as a fault: the price reveals that it doesn’t have a devoted support system, and in any case it isn’t free at all. It’s actually very costly!

If you’ve spent any time with an economist you should know that there is no such thing as a free lunch. If R is free we should not simply assume it is better. To the contrary, we should ask why it is free. As I have tried to argue elsewhere, commercial software costs money because when you purchase it you aren’t just purchasing a few lines of code. You’re purchasing the support system that comes with it. When a company purchases Stata, or any commercial software, it does so with the expectation that it can call a dedicated hotline for troubleshooting. As software has evolved, companies have experimented with pricing to acknowledge the fact that we don’t purchase a one-time product but a continuous support system.

Consider Xbox or PlayStation’s online services. Access is billed per period of use because it costs money to run servers and provide customer support. Even ‘freemium’ games, which nominally don’t require any money to play, survive off microtransactions, which enable companies to earn steady revenues in exchange for continuing support and new content. I would not be surprised if freemium statistical software is tried in the future – access to basic regressions is free but more advanced models cost money to run. I half joke.

But let’s assume you’re good at coding and don’t need much support outside of a few days reading an R book. Should you praise R for being ‘free’? No, because you still paid with your time. Every hour spent learning how to code in R is an hour you could have spent doing any number of other things.

Now to be clear, you may still want to learn R if it frees up your time in the future by automating X process. This post isn’t to argue against adopting R. My point is only to say that it isn’t free in a meaningful sense. Adopting R costs in the sense that you’re giving up a devoted support system and value of time equal to how long it takes you to become proficient in it.

It’s possible that once you account for those things R is still ‘cheaper’ than commercial software like Stata or SPSS. That is an empirical question beyond the scope of this post.

Some thoughts on the ivory tower; part 1: discrimination

I entered academia in 2009 when I started my bachelor’s degree and began graduate studies in 2014 when I entered a master’s program. I have been in the ivory tower in some form for almost a decade. Others have spent much more time in the tower than I, but I am hardly a newcomer. I hope then that I can offer thoughts on discrimination and mental health in the ivory tower.

In the past few years I have noted an increasingly self-aware discussion on the lack of diversity, both in terms of phenotype and ideology, in the ivory tower. The tower is full of center-left white men. I have seen various formal (e.g. #womenalsoknowstuff) and informal groups advocate for greater inclusion in the tower. For the record there are non-leftist groups involved in this as well. CU Boulder has a program to increase the number of conservative intellectuals. The Institute for Humane Studies (IHS) essentially serves to advocate for classical liberals in the tower.

There is nothing wrong with these goals. Women Also Know Stuff tries to do this by advertising the work of female scholars. IHS does it by inviting classical liberals to book discussions – and providing beer. Both approaches sound sensible to me. My concern is that ultimately the pipeline isn’t being fixed. Not really. Both approaches help those who managed to enter, at minimum, graduate school but do little to solve the more structural reasons why their respective groups are rare in the tower.

Why are there so few women and classical liberals (and especially so few classical liberal women!) in academia? It’s partly cultural and partly institutional.

Minorities get made fun of in academia. Academics like to think of themselves as cosmopolitan, but it’s a big lie. A recent undergrad thesis by a Berkeley student looked at misogynistic discussions on an online forum frequented by economists. I disagree with the research design of the paper, but I believe the general argument that the tower is filled with misogyny. I also believe it’s filled with dislike for conservatives, Christians, atheists, whites, blacks, Arabs, Chinese, etc.

I don’t think the tower is unique in this. Human beings divide themselves by groups and I don’t see why that will ever change. I think the academy is just a bit whiter and a bit more lefty because of sorting effects. You can see this happening even within the tower. Classical liberals sort into economics – how many classical liberal anthropologists do you know? Not counting NoL’s chief editor? Some minorities sort into ethnic studies. How many black game theorists do you know? Native American psychometricians?

What can we do? I’m not sure. We can improve the pipeline so that grad students, and eventually faculty, get more diverse. However, I suspect the sorting problem will remain. Superficially we will have more diversity, but is it really diversity if we’re sorted by discipline and subfield? Should we force new classical liberals to enroll in sociology grad programs? I don’t know. Maybe we should give up on diversity altogether and focus on abolishing the state. Maybe? Who knows? What do you all think?

By the way, if you want to know what true cosmopolitanism is, visit an inner city. True cosmopolitanism is seeing blacks, Mexicans and Koreans eating pupusas made by a Honduran. Everything else is a GAP commercial concocted by HR people.

When to list working papers?

I have been updating my CV this past weekend and in the process have spent more time than I should have looking at others’ CVs for reference. The experience has reminded me of two things: (1) I do not share others’ infatuation with LaTeX and (2) I despise how working papers are listed.

My primary concern with many CVs is that some people list working papers along with peer reviewed published papers. I cannot help but feel this is weaseling. This is not aided when people list “revise and resubmits” along with actual publications. An R&R is not a publication. By all means it is a good sign that a paper will get published, but it is not a publication.

My second concern is that people list working papers, but offer no link to a draft copy. In the absence of a readily accessible draft, how am I to know if someone has a ‘real’ working paper or simply some regression results on a PowerPoint slide? I am especially irked when I contact an author asking for a draft of their working paper and am told that no such draft exists.

I’m still a graduate student, but if I am to be humored I think academia would benefit if it became the norm to list working papers (and R&Rs) in a separate section and if it were required to upload a draft on SSRN (or whatever your preferred repository is).

Likewise I think it best to list book reviews and other non-peer reviewed materials separately. I was surprised the other day to find people who listed op-eds in local newspapers or blog posts under publications. Don’t get me wrong – I think some blog posts (especially those on a certain site) are great reads! But peer reviewed publications they are not.

Does this sound reasonable?

The importance of understanding causal pathways: the case of affirmative action.

Let us put aside the question of whether affirmative action is a desirable goal. Instead I wish to ponder how to implement affirmative action, given that it will be implemented in some form regardless.

The logic of most affirmative action programs is that X vulnerable community’s outcomes (Y) are significantly below the average. For the sake of example let us say that X is Cherokees and Y is the number of professional baseball players from that ethno-racial group.

Y = f(X) 

A public policy analyst who simply noted the under representation of Cherokees in the MLB, without digging deeper into the causal pathway, may propose that quotas be implemented requiring teams to have a certain share of Cherokee players. Such a proposal would be a bad one. It would be bad because it could lead to privileged Cherokees gaining spots in the MLB at the expense of less privileged individuals from other ethno-racial groups.

A better public policy analysis would note that Cherokees are less likely to enter professional baseball because they are malnourished (Z). This analyst, recognizing the causal pathway, may instead propose a program be implemented to deal with malnourished individuals regardless of their ethno-racial identity.

Y = f(Z); Z = g(X)

Most affirmative action programs that I have come across are of the former type. They recognize that X ethno-racial group is performing poorly in Y outcome, and propose action without acknowledging Z. We need more programs that are designed with Z in mind.
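
To make the difference between the two designs concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the six-person population, the `z` score (a hypothetical measure where higher means better nourished), and the 0.5 cutoff for "low Z" are all made up and are not real data. It only shows that a program targeting X and a program targeting Z can reach different people.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    in_group_x: bool  # member of the targeted group X
    z: float          # hypothetical nourishment score, higher is better


# Made-up population for illustration only, not real data.
POPULATION = [
    Person("A", True, 0.9),
    Person("B", True, 0.2),
    Person("C", False, 0.1),
    Person("D", False, 0.8),
    Person("E", True, 0.7),
    Person("F", False, 0.3),
]

Z_THRESHOLD = 0.5  # assumed cutoff below which someone counts as low on Z


def target_by_group(people):
    """First type of program: select everyone in group X."""
    return [p for p in people if p.in_group_x]


def target_by_z(people, threshold=Z_THRESHOLD):
    """Second type of program: select everyone low on Z, whatever their group."""
    return [p for p in people if p.z < threshold]


def share_low_z(recipients, threshold=Z_THRESHOLD):
    """Fraction of recipients who are actually low on Z."""
    if not recipients:
        return 0.0
    return sum(p.z < threshold for p in recipients) / len(recipients)


group_recipients = target_by_group(POPULATION)
z_recipients = target_by_z(POPULATION)

print("Targeting X reaches:", [p.name for p in group_recipients])
print("Share of those low on Z:", round(share_low_z(group_recipients), 2))
print("Targeting Z reaches:", [p.name for p in z_recipients])
print("Share of those low on Z:", round(share_low_z(z_recipients), 2))
```

In this toy population, targeting X reaches two people who are not low on Z and misses two people outside the group who are. How large that gap is in any real program is an empirical question the sketch cannot answer.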

I do not say any of this because I am an upper class white male who resents others receiving affirmative action. To the contrary. I have benefited from this type of affirmative action several times in my life. On paper I am a gold mine for a human resources worker looking to fulfill diversity quotas: I am an undocumented Hispanic of Black-Jewish descent who was raised in a low income household. I am not, however, vulnerable. I come from a low income household, but my Z is not low. Not really.

Despite my demographic group, I am not malnourished. I could stand to lose weight, but I am not unhealthy. I attended a state university, but my undergraduate education is comparable to that of someone who attended a public ivy. My intelligence is on the right side of the bell curve. Absent affirmative action I am confident I would achieve entry into the middle class.

Nor am I a rarity among beneficiaries. My observation is that many beneficiaries of affirmative action programs are not low on Z and, left alone, would achieve success on their own. Affirmative action programs are often constructed in such a way that someone low on Z could not navigate their application process. It may seem egalitarian to require applicants to submit course transcripts, to write essays, or to present letters of recommendation. However, these seemingly simple tasks require a level of Z that the truly underprivileged do not have.

Good public policy analysis requires us to understand the causal pathway that explains why X groups do not achieve success at similar rates as other groups. We must design programs that target undernourishment instead of simply targeting Cherokees. If we fail to do so we may have more Cherokees playing for the Dodgers, but we will have failed to solve the deeper problem.

Note that I say vulnerable as opposed to ‘minority’ in the above passage. This is to acknowledge that many so-called minority groups are nothing of the sort. Hispanics, Blacks, and Asians form majorities in various parts of the Southwest, the South, and the Pacific (e.g. Hawaii). Women likewise are not a minority, but are often covered by affirmative action programs. Jews are in many instances minorities, but in contemporary life are far from underrepresented in society’s top professions. This distinction may seem too obvious to be worth making, but it is not. Both sides of the political spectrum forget that the ultimate goal of affirmative action is to aid vulnerable individuals. Double emphasis on individuals.

What is the optimal investment in quantitative skills?

As I plan out my summer I am debating how to allocate my time in skill investment. The general advice I have gotten is to increase my quantitative skills and pick up as much about coding as possible. However, I am skeptical that I should invest too much in quantitative skills. There are diminishing returns, for starters.

More importantly, though, artificial intelligence and computing power are advancing every day. When my older professors were trained they had to use IBM punch cards to run simple regressions. Today my phone has many times that computing power, not to mention my PC. I would not be surprised if quantitative analysis is taken over entirely by AI within a decade or two. Even if it isn’t, it will surely be easier and require minimal knowledge of what is happening. In which case I should invest more heavily in skills that cannot be replaced by AI.

I am thinking, for example, of research design or substantive knowledge of research areas. AI can beat humans at chess, but I can’t think of any that has written a half-decent history text.

Mind you, I cannot abandon learning a base level of quantitative knowledge. AI may take over in the next decade, but I will be on the job market and seeking tenure before then (hopefully!).

Know your data, show your data: A rant

I am finishing up my first year of doctoral level political science studies. During that time I have read a lot of articles – approximately 550. 11 courses. 5 articles a week on average. 10 weeks. 11×5×10=550. Two things have bothered me immensely when reading these pieces: (1) it’s unclear whether authors know their data well, regardless of whether it is original or secondary data, and (2) the reader is rarely shown much about the data.

I take the stance that when you use a dataset you should know it inside and out. I do not just mean that you should have an idea of whether it’s normally distributed or has outliers. I expect you to know who collected it. I expect you to know its limitations.

For example I have read public opinion data that sampled minority populations. Given that said populations are minorities they had to oversample in areas where said groups are over represented. The problem with this is that those who live near co-ethnics are different from those who live elsewhere. This restricts the external validity of results derived from the data, but I rarely see an acknowledgement of this.

Sometimes data is flawed but it’s the best we have. That’s fine. I’m not against using flawed data. I’m willing to buy most arguments if the underlying theory is well grounded. To be honest I view statistical work to be fluff most times. If I don’t really care about the statistics, why do I care if the authors know their data well? I do because it serves as a way for authors to signal that they thought about their work. It’s similar to why artists sometimes place a “bowl of only green m&ms” requirement on their performance contracts. Artists don’t know if their contracts were read, but if their candy bowl is filled with red twizzlers they know something is wrong. I can’t monitor whether the authors took care in their manuscripts, but NOT seeing the bowl of green only m&ms gives me a heads up that something is off.

Of those 500+ articles I have read only a handful had a devoted descriptive statistics section. The logic seems to be that editors encourage authors to move such material to appendices to make articles more readable. I don’t buy that argument for descriptive statistics. Moving robustness checks or replications to the appendices is fine, but descriptive stats give me a chance to actually look at the data and feel less concerned that the results are driven by outliers. In my 2nd best world all dependent variables and major independent variables would be graphed. If the data were collected in differing geographies I would want the data mapped. In my 1st best world replication files with the full dataset and do-files would be mandatory for all papers.
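
As a rough illustration of why I want descriptive statistics up front, here is a minimal Python sketch using only the standard library. The `describe` helper and the `sample` values are hypothetical and made up for this example; they do not come from any paper I read. The sketch shows how a single outlier pulls the mean away from the median, which is exactly the sort of thing a short summary table lets a reader spot.

```python
import statistics


def describe(values):
    """Return a small descriptive summary of a list of numbers."""
    ordered = sorted(values)
    return {
        "n": len(ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "stdev": statistics.stdev(ordered),  # sample standard deviation
        "min": ordered[0],
        "max": ordered[-1],
    }


# Made-up values for illustration; the last observation is an outlier.
sample = [2, 3, 3, 4, 4, 5, 5, 6, 48]

summary = describe(sample)
trimmed = describe(sample[:-1])  # same data with the outlier dropped

for label, stats in (("with outlier", summary), ("without outlier", trimmed)):
    print(label)
    for key, value in stats.items():
        print(f"  {key}: {value:.2f}" if isinstance(value, float) else f"  {key}: {value}")
```

A gap between the mean and the median, or a maximum far from the rest of the values, does not prove the results are driven by outliers, but it tells the reader where to look.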

I don’t think I am asking too much here. Hell, I am not even fond of empirical work. My favorite academic is Peter Leeson (GMU Econ & Law) and he rarely (ever?) does empirical work. As long as empirical work is being done in the social sciences though I expect a certain standard. Otherwise all we’re doing is engaging in math masturbation.

TL;DR: I don’t trust most empirical work out there. I’ll rant about excessive literature reviews next time.