Massive Stock Datasets

When data mining, the first step is to obtain the data that you would like to mine. I have decided to try my hand at playing the stock market, so it became necessary for me to obtain historical stock market data. To that end, I have devised a method to obtain end-of-day results for every listing on NYSE, AMEX and NASDAQ since their inception. The data is in the process of being assembled and I expect it to be complete within a few days. I currently estimate that the data will take up approximately 2GB, making it the largest single dataset that I have ever played with. Just having this much data makes my data-hoarding senses tingle.

I’ll probably spend a little bit of time putting the data into an easy-to-understand, easy-to-use format and then I’ll start looking for patterns. I’m hoping to throw my modeling background and experience at the stock market to see if I can’t beat the system. If I can beat the stock market and make bajillions of dollars (or euros if the dollar collapses), that would be pretty sweet, but if I don’t, at the very least, I expect to have fun playing with lots and lots of numbers.

As a second approach, since it turns out to be rather difficult to get this sort of data in the first place, I’m half considering the idea of cleaning it up a bit and then reselling it myself.

4 Responses to “Massive Stock Datasets”

  1. PRH says:

    So your hope is that past prices predict future prices (instead of earnings, volume, etc.)? You might save yourself some time (but not money) by looking here: http://www.crsp.com/products/index.html

  2. gwax says:

    I don’t hope that past prices predict future prices; I hope that I can use past prices to help correlate prices to something else in a manner that can predict future prices. At this point, there’s no need for me to pay for someone else’s dataset. I’ve got mine built, and it turns out to be only 700MB too.

  3. PRH says:

    Nice! Well, remember us little people when you make it big.

  4. gwax says:

    Done; when I make it big, you can be my ridiculously overpaid, personal attorney.
