Hacker News | smhchan's comments

Happy to explore other choices together with the community. Some users have voted for Riak: https://predictionio.uservoice.com/forums/219398-general/sug...


Riak is a good call. A SQL db would be nice too.


It's the real-time prediction queries, e.g. geospatial search, that make use of Mongo's indices.


Thanks for the clarification; the write-up isn't clear. Have you benchmarked against PostGIS or stock MySQL? And have you tried any larger-than-memory databases?

We were using Mongo in a suite of web applications that display the results of ML and statistical analysis of cancer data, and we've found its query performance lacking in a number of cases. I think the Mongo geospatial index is a pretty simple geohash setup on top of their normal query engine, and I would expect it to have the same issues.

I do think this project is very interesting, just providing my feedback based on doing similar work.

Memory overhead of both Mongo and Hadoop would actually be my biggest worry. Especially on desktop workstations, it is quite common for machine learning tools in R or Python to need most of the available memory when tackling even small problems.
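
For anyone unfamiliar with the "geohash setup" mentioned above, here is a minimal sketch of plain geohash encoding in Python. It only illustrates the general technique (interleaving longitude and latitude bisection bits and base32-encoding them); it is not MongoDB's actual index code, and the function name `geohash_encode` is made up for this example.

```python
# Standard geohash base32 alphabet (no a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 9) -> str:
    """Encode a latitude/longitude pair as a geohash string.

    Hypothetical helper for illustration only, not a MongoDB API.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0        # accumulates 5 bits per output character
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)


if __name__ == "__main__":
    # A shorter hash is always a prefix of a longer hash for the same point,
    # which is what lets an ordinary string/B-tree index answer "nearby" queries.
    print(geohash_encode(57.64911, 10.40744, precision=5))
    print(geohash_encode(57.64911, 10.40744, precision=9))
```

Because a geohash is just a sortable string, a prefix lookup on a regular index approximates a bounding-box search. Points that are close together but sit on opposite sides of a cell boundary can end up with very different prefixes, so a real query has to check neighboring cells as well.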


What would be the better alternatives, in your opinion?


MongoDB is used as the datastore for unstructured data, e.g. item attributes and user attributes. It's also used as a cache for prediction results, so queries like geospatial search can be performed.

There is no specific reason to stick with MongoDB only. It just happens to be the database the team picked for the first implementation. Support for other databases is very likely in the near future, given the strong community demand.


This tutorial demonstrates how to build a serendipity-focused discovery engine with open source PredictionIO.


Are there other personalization use-case demos you guys want to see?


PredictionIO is open source software that you can download from GitHub.


It describes the real differences among Machine Learning, Predictive Analytics, Data Analytics, Classification, Pattern Recognition, Data Science, Statistical Analysis, Data Warehousing, Data Mining, Knowledge Discovery, Artificial Intelligence, Business Intelligence, etc.

