Hacker News | bdamos's comments

Thanks for the feedback, I just updated the animation to pause at the beginning and end to make the completions clearer: https://github.com/bamos/bamos.github.io/commit/f6a152851355...


The argument isn't that you use OpenCV: OpenFace also uses OpenCV. However, I think you should present your program as one that uses face recognition, not as a face recognition program. You're using, without crediting it, the existing off-the-shelf face recognition functionality already in OpenCV: https://github.com/jwcrawley/uWho/blob/2823479d5abf9f8f2de21...


OpenFace can optionally use a CUDA-enabled GPU, but it's not a requirement. The performance is almost real-time on a CPU. After detection (which varies depending on the input image size), the recognition takes less than a second. We have a few performance results on the FAQ at http://cmusatyalab.github.io/openface/faq/

I'm surprised (and skeptical) uWho can do detection+recognition at 15fps. I would expect face detection alone in 1280x720 images to be much slower than 15fps. On my 3.7GHz CPU with a 1050x1400px image, dlib's face detector takes about a second to run. This is also my experience with OpenCV's face detector, which I noticed your code is using. Also OpenCV's face detector returns many false positives, especially in videos. See this YouTube video for an experimental comparison: https://www.youtube.com/watch?v=LsK0hzcEyHI

Also, I think it's a strong claim that faces can't be generated from a perceptual hash. One property of perceptual hashes is that hashes with a small Hamming distance correspond to more similar inputs (here, the same person). I wouldn't be surprised if a model could successfully map perceptual hashes back to faces given enough training data. I read a good paper about doing this (not specific to faces) but can't remember the reference now.
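To make the Hamming-distance property concrete, here's a minimal sketch with made-up 64-bit hash values (the values are illustrative, not real perceptual hashes):

```python
# Sketch of the perceptual-hash property: hashes of similar faces
# should differ in only a few bits, hashes of different faces in many.

def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hash values."""
    return bin(a ^ b).count("1")

# Two hypothetical hashes of the same person (nearly identical bits)
# and one hash of a different person.
same_a, same_b = 0xF0F0F0F0F0F0F0F0, 0xF0F0F0F0F0F0F0F1
other = 0x0F0F0F0F0F0F0F0F

print(hamming_distance(same_a, same_b))  # 1: small distance, same person
print(hamming_distance(same_a, other))   # 64: large distance, different person
```

If a model can invert hashes back to faces, this smoothness is exactly what it would exploit: nearby hashes should decode to similar-looking faces.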

Edit: I just added some simple timing code to this sample OpenCV face detection project on my 3.60GHz machine: https://github.com/shantnu/FaceDetect On the John Lennon image from the OpenFace FAQ, sized 1050x1400px, it takes 0.32 seconds, which is about 3fps. This is slightly quicker than dlib's detector on the same image, but it also returned a false positive.
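The timing code amounts to a couple of lines around the detector call. A sketch with a stand-in detector (real code would call something like OpenCV's detectMultiScale):

```python
import time

def detect_faces(image):
    """Stand-in for a real detector call; sleeps to simulate work."""
    time.sleep(0.01)
    return []

start = time.perf_counter()
detect_faces(None)
elapsed = time.perf_counter() - start
print("%.2f s/frame, %.1f fps" % (elapsed, 1.0 / elapsed))
# A 0.32 s detection corresponds to 1 / 0.32, or about 3.1 fps.
```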


I'm using a few tricks to speed up the performance, and I'm perfectly fine with you questioning it :) I encourage you to try it out.

My first problem/observation is that Haar cascades looove running on a GPU due to their Float-y nature. But dealing with them on a CPU frankly stinks. I was getting 1 frame/10 seconds at 800x600 with the included Haar face detector. That's effectively unusable.

Turns out there are also LBP cascades, which are integer-based, and they run fast on a CPU. From my observations they produce many false positives, but they seem to have no issue with false negatives, so I grab all the faces plus a few "junk" regions.

The speedup is that I can use an LBP cascade and then throw the region of interest (the potential face) onto a Haar cascade eye detector. Now that I'm dealing with much smaller pictures, Haar runs acceptably. Literally, if (eyes.size > 0) { is valid face ... }
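A minimal sketch of that two-stage filtering, with stub detectors standing in for the real cascades (actual code would use cv2.CascadeClassifier with lbpcascade_frontalface.xml and haarcascade_eye.xml):

```python
# Two-stage idea: a fast but noisy LBP face detector proposes regions,
# and a Haar eye detector run only on those small regions filters out
# the false positives. The detectors here are stubs for illustration.

def lbp_face_candidates(frame):
    """Fast, integer-based stage: finds all real faces plus some junk."""
    return [("face1", True), ("face2", True), ("junk", False)]

def haar_eyes(region):
    """Slower stage, but cheap when run on a small region of interest."""
    _, is_real_face = region
    return ["left_eye", "right_eye"] if is_real_face else []

def detect_faces(frame):
    faces = []
    for roi in lbp_face_candidates(frame):
        eyes = haar_eyes(roi)  # Haar only ever sees small candidate crops
        if len(eyes) > 0:      # "if (eyes.size > 0) { is valid face ... }"
            faces.append(roi)
    return faces

print(detect_faces(None))  # the junk region is filtered out
```

The key property is that the expensive detector's input is bounded by the candidate size (e.g. 50x50), not the full frame.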

Then I use the built-in functions in OpenCV's contrib Face library. The problems with the library are numerous. Mainly, the settings are provided without good descriptions, left at whatever defaults the original academic papers used.

Because I'm also an academic, I was able to get ahold of quite a few large face datasets. I then wrote a few small programs to calculate the ideal settings for the FaceRecognizer call, which I believe I found. (The settings are in the call, in the source.)

Of course, I do get some slowdowns depending on how many faces there are (mainly, stay away from Google image searches for faces). But then again, four Haar runs on 50x50 images is not bad at all.

My machines used: Thinkpad T61 (8GB ram), Intel NUC (8GB ram, I5 cpu) Camera: Logitech C920 webcam

I did try my code using the max resolution the camera could acquire (1920x1080).... 1 frame/5 seconds.


Thanks for pointing out the 404, I just corrected the link.

There's an interesting discussion on lobste.rs from a few months ago about privacy issues and licensing: https://lobste.rs/s/sajz0s/openface_face_recognition_with_go....


Summary: OpenFace uses fundamentally different techniques (a deep neural network) for face recognition that OpenBR currently doesn't provide.

--

As our initial ROC curve on LFW's similarity benchmark in https://github.com/cmusatyalab/openface/blob/master/images/n... shows, this approach results in slightly improved performance. The ideal point is an FPR of 0.0 and a TPR of 1.0 (top left). You can see today's state-of-the-art private systems nearest the top left, followed by open source systems, and then by historical techniques OpenCV provides, like Eigenfaces. The dashed line in the middle shows what randomly guessing would give.
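For readers unfamiliar with ROC curves, here's a minimal sketch of how one (FPR, TPR) point is computed from similarity scores; the scores and labels are made up:

```python
# One point on an ROC curve: at a given similarity threshold, TPR is
# the fraction of same-person pairs accepted and FPR is the fraction
# of different-person pairs accepted. Sweeping the threshold traces
# the full curve.

def roc_point(scores, labels, threshold):
    """Return (fpr, tpr) at the given decision threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    return fp / negatives, tp / positives

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]  # pairwise similarity scores
labels = [1,   1,   0,   1,   0,   0]    # 1 = same person
print(roc_point(scores, labels, 0.5))    # (fpr, tpr) at threshold 0.5
```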

OpenBR is going in a great direction for reproducible and open face recognition. They provide a pipeline for preprocessing and representing faces, as well as doing similarity and classification tasks on the representations. The techniques from OpenFace could be integrated into OpenBR's pipeline.


That sounds awesome. I don't quite understand the part about the pipeline, but maybe in the future! hehe


Thanks for the offer! Our original model `nn4.v1` should perform OK on your data if you're interested in trying to automatically predict people in new images.

Training new models is currently dominated by huge industry datasets, which have hundreds of millions of images. My current dataset, drawn from datasets available for research, has ~500k images.


Yes, the processing pipeline first does face detection and a simple transformation to normalize all faces to 96x96 RGB pixels. Then each face is passed into the neural network to get a 128-dimensional representation on the unit hypersphere.

For a landscape, face detection would probably not find any faces and the neural network wouldn't be called.

And an image with multiple people will have many outputs: the bounding boxes of faces and associated representations.
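One useful consequence of the unit-hypersphere constraint is that squared Euclidean distance and cosine similarity between representations carry the same information. A tiny sketch, with 3-d vectors standing in for the 128-dimensional representations:

```python
import math

# For unit vectors, ||a - b||^2 = 2 - 2 * dot(a, b), so ranking faces
# by Euclidean distance or by cosine similarity gives the same order.

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a = normalize([1.0, 2.0, 2.0])  # stand-ins for network outputs
b = normalize([2.0, 1.0, 2.0])
print(sq_dist(a, b), 2 - 2 * dot(a, b))  # the two values match
```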


This depends on what you want to use face recognition for. Maybe I should say more clearly in the README who this project is for. I could have released trained classifiers for 10,000 celebrities, but I focused the project on providing an easy way to train new classifiers with small amounts of data. I think this direction allows more people to use and benefit from the library.

For example, check out our YouTube video of a demo training a classifier in real-time with just 10 images per person at https://www.youtube.com/watch?v=LZJOTRkjZA4. This demo is included in the repo and the README has instructions on running it.

Also note that there is a distinction between training the neural network, which extracts the face representations, and using those representations for tasks like clustering and classification.
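To sketch that distinction: with the network's representations held fixed, the per-person classifier can be very simple. A hypothetical nearest-centroid classifier on made-up 2-d "representations" (standing in for the 128-d outputs):

```python
import math

# The network is trained once on huge datasets; the classifier below
# is what gets retrained cheaply with ~10 images per person.

def centroid(vectors):
    return [sum(xs) / len(xs) for xs in zip(*vectors)]

def train(examples):
    """examples: {person: [representation, ...]} -> {person: centroid}"""
    return {person: centroid(reps) for person, reps in examples.items()}

def predict(model, rep):
    """Return the person whose centroid is nearest to rep."""
    return min(model, key=lambda p: math.dist(model[p], rep))

model = train({
    "alice": [[0.9, 0.1], [1.0, 0.0]],
    "bob":   [[0.1, 0.9], [0.0, 1.0]],
})
print(predict(model, [0.8, 0.2]))  # closest to alice's centroid
```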


Thanks, I'm hosting on a nonstandard port on a server. I put the main page inside a frame hosted on GitHub pages for a better URL, but this broke the links. Fixed now. :-)


I have two minor comments:

1) Linking to the original HN hiring post is helpful, and

2) I'm on a poor internet connection now, and this page took about a minute to load. http://hnhiring.com/ and the original July 2015 thread (https://news.ycombinator.com/item?id=9812245) load in ~5 seconds.


[AngJobs](https://github.com/victorantos/AngJobs) took 1-2 seconds to load the [HN filter](http://angjobs.com/#!/jobs/inbox/hn?july)


I use HN's Firebase API, which means I need a separate HTTP (XHR) call for every job record. AngJobs has its own database/API. I intentionally avoided that pattern because I wanted to practice promises in Angular.


I think you could cache the ajax calls and gain some loading performance. AngJobs is using [angular-cache](https://github.com/jmdobry/angular-cache) for example.


(1) is probably going to make it in at some point! Unfortunately, I need to make an HTTP request for every post (that's how the HN Firebase API works) and that gets pretty expensive on slow connections. :/


Yeah, I built http://hnhiring.me well before the API, via scraping. I looked into moving over after it was released but found it completely impractical for this sort of thing.

Hopefully it'll be improved at some point, especially given how bad the HN markup is (fun fact: all comments are at the same level in the DOM; the indentation is achieved via a blank gif with a width property).
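A sketch of how a scraper might recover the nesting from that spacer gif (the 40px-per-level multiplier and the exact attribute layout are my assumptions about HN's markup, which may vary):

```python
import re

# HN's comment HTML puts every comment at the same DOM level; the
# nesting is encoded only in the width of a blank spacer image, so a
# scraper has to parse that width back into a depth.

SPACER = re.compile(r'<img src="s\.gif" height="1" width="(\d+)"')

def comment_depth(comment_html: str) -> int:
    """Recover nesting depth, assuming 40px of spacer per level."""
    match = SPACER.search(comment_html)
    return int(match.group(1)) // 40 if match else 0

print(comment_depth('<img src="s.gif" height="1" width="80">'))  # 2
```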


Ouch, that sounds painful. Glad I didn't go the scraping route :P


Firebase dev here: the websocket API would be much faster (less HTTP overhead and connection negotiation) and allows more concurrent requests.

replace lots of these:

  return $http.get('https://hacker-news.firebaseio.com/v0/user/' + name + '.json');
with

  return $q(function(resolve, reject) {
    new Firebase('https://hacker-news.firebaseio.com/v0/user').child(name).on('value', function(snap) {
      resolve(snap.val());
    }, reject); // pass errors through to reject the promise
  });


  https://www.firebase.com/blog/2014-10-07-hacker-news-api-is-firebase.html

  https://docs.angularjs.org/api/ng/service/$q

(Yes, the HN API design is a little funky, but you can definitely improve the loading times with websockets.)


Hey that IS awesome. I had no idea Firebase had a websocket API. Sweet!


Send us a message at https://groups.google.com/forum/#!forum/firebase-talk if you have any problems.

The websocket API is the main selling point of Firebase, but I guess you weren't actively looking to use Firebase in a product; rather, it was a necessity of the HN API. Good luck! Don't hesitate to contact us.


Looking at the Firebase API docs, I realized how unRESTful their REST API is. There's no way to get a list of all stories: you can only get a list of ids and must then query the API with each id to get the content. If you plan on going this route, I definitely agree you should store your results in your own backend DB. If you want to stay stateless, look for a better HN job source, e.g., http://hnapp.com/?q=type%3Ajob+|+author%3Awhoishiring. Just one request, and you can parse either the JSON or RSS feed.


> Unfortunately, I need to make an HTTP request for every post (that's how the HN Firebase API works)

What's the total size of the jobs data? Would it be possible to grab all the JSON from the API, concatenate it and then self-host it as a single file? I imagine a $5 DO droplet with nginx would suffice for that.

It looks like Firebase isn't even gzipping the JSON.
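A sketch of that self-hosting idea using only the standard library; the job items are made up:

```python
import gzip
import json

# Fetch every job item once on the server, concatenate the results
# into a single JSON document, and serve it gzipped so the client
# makes one request instead of one per post.

items = [
    {"id": 1, "title": "Acme Corp | Backend Engineer | Remote"},
    {"id": 2, "title": "Widgets Inc | Frontend Engineer | NYC"},
]

payload = json.dumps(items).encode("utf-8")
compressed = gzip.compress(payload)

# A static server (e.g. nginx with gzip enabled) can serve the
# compressed document directly; the client decompresses and parses it.
print(json.loads(gzip.decompress(compressed)) == items)  # True
```

Regenerating the file on a cron schedule would keep it fresh without any per-visitor API traffic.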


Yep, certainly possible. It wasn't my aim though, as I wanted some Angular practice with promises :)

