Links

Some pointers to where real data can be found on the web.

Time-series:

  1. Economic:      http://www.economicswebinstitute.org/ecdata.htm
  2. Industrial:       http://homes.esat.kuleuven.be/~smc/daisy/daisydata.html
  3. TSDL:            http://robjhyndman.com/TSDL/
  4. UK data:        http://data.gov.uk/about
  5. EEG:             http://sccn.ucsd.edu/~arno/fam2data/publicly_available_EEG_data.html
  6. Mike West:    http://www.stat.duke.edu/~mw/ts_data_sets.html
  7. UWO:            http://www.stats.uwo.ca/faculty/aim/epubs/datasets/default.htm

Data mining:

  1. MLdata:          http://mldata.org/

  2. Duke:              http://www.stat.duke.edu/~mw/ts_data_sets.html

  3. UCI data:        http://archive.ics.uci.edu/ml/index.html

  4. INEX:              http://inex.otago.ac.nz/, http://webspam.lip6.fr/

  5. PASCAL:        http://pascallin2.ecs.soton.ac.uk/Challenges/

  6. Clopinet:         http://clopinet.com/challenges/

  7. KDnuggets:    http://www.kdnuggets.com/datasets/competitions.html

  8. Delicious:        http://www.delicious.com/pskomoroch/dataset

  9. Data Wrangling:   http://www.datawrangling.com/some-datasets-available-on-the-web

  10. Datamob:        http://datamob.org

  11. Ranking (Yahoo):   http://learningtorankchallenge.yahoo.com/

  12. Ranking (MSLR):   http://research.microsoft.com/en-us/projects/mslr/

  13. ed.ac.uk:         http://www.inf.ed.ac.uk/teaching/courses/dme/html/datasets0405.html

  14. Million Song:   http://labrosa.ee.columbia.edu/millionsong/

  15. Nokia:             http://research.nokia.com/mdc

  16. Yandex:          http://imat-relpred.yandex.ru/en

  17. biomed:           http://datam.i2r.a-star.edu.sg/datasets/krbd/

  18. Kaggle:            http://www.kaggle.com/

  19. Mindboggle:    http://mindboggle.info/index.html

  20. CAMrA:           http://2011.camrachallenge.com/

  21. Statistical Machine Translation:     http://www.statmt.org/

BioMed:

  1. Statlib:            http://lib.stat.cmu.edu/datasets/
  2. StatSci:           http://www.statsci.org/datasets.html
  3. Klein book:     http://www.mcw.edu/biostatistics/Faculty/Faculty/JohnPKleinPhD/SurvivalAnalysisBook/DataSetsBothEditions.htm
  4. PhysioBank:     http://physionet.caregroup.harvard.edu/physiobank/database/
  5. PhysioNet:      http://www.physionet.org/challenge/
  6. GLIMs:            http://www.sci.usq.edu.au/staff/dunn/Datasets/tech-glms.html

Software:

  1. CVX:                 http://cvxr.com/cvx/

  2. Tfocs:                http://tfocs.stanford.edu/

  3. Mosek:              http://www.mosek.com/

  4. Shogun:            http://www.shogun-toolbox.org/

  5. Weka:                http://www.cs.waikato.ac.nz/ml/weka/

  6. Mahout:            http://mahout.apache.org

  7. Scikit-learn:     https://scikit-learn.org

ML Networks:

  1. NERF:             http://www.nerf.be/

  2. Kurzweil:         http://www.kurzweilai.net/

  3. Sciencemag:   http://www.sciencemag.org/site/feature/data/compsci/machine_learning.xhtml

  4. PASCAL:         http://www.pascal-network.org/

ML Conferences:

  1. NIPS

  2. ICML

  3. ECML/KDD

  4. COLT

  5. ALT

  6. ICANN

  7. ESANN

Blogs:

  1. Hunch:                       http://hunch.net/

  2. Nuit Blanche:             http://nuit-blanche.blogspot.se/

  3. My Biased Coin:        http://mybiasedcoin.blogspot.se/

  4. Mark Reid’s:              http://mark.reid.name/

  5. InherentUncertainty:  http://www.inherentuncertainty.org/

Some Books:

  1. The Elements of Statistical Learning

  2. Prediction, Learning, and Games

  3. Machine Learning

  4. Pattern Recognition