Recently, together with Jose Eliel Camargo, I have been exploring a simple and appealing idea. When writing scientific articles, researchers put a lot of effort into collecting relevant references and placing them within their text for different purposes: to give credit, to guide the reader to other points of view, to support some statement, etc. This means that looking for papers that tend to be cited close to each other across a collection of scientific articles should be a good way to find groups of similar or related articles.
With this in mind, we extracted reference lists from inspirehep using the inspirehep python wrapper. For us, each reference list is just a list of inspire article ids. We then trained a Skip-Gram model using the gensim library implementation, treating each reference list as a "sentence" and each article id as a "word". We end up with a dense vector for every inspirehep article id, from which we can extract similar items using cosine similarity. Very simple!
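To make the Skip-Gram idea concrete, here is a minimal sketch of how (target, context) training pairs could be derived from reference lists. It is not our actual pipeline, since gensim handles this internally. The article ids, the `skipgram_pairs` helper and the window size are all made up for illustration.

```python
def skipgram_pairs(reference_list, window=2):
    """Yield (target, context) pairs for ids cited within `window` positions."""
    for i, target in enumerate(reference_list):
        lo = max(0, i - window)
        hi = min(len(reference_list), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # an article is not its own context
                yield target, reference_list[j]


# Hypothetical article ids, invented for this example
reference_lists = [
    ["id_a", "id_b", "id_c"],
    ["id_b", "id_c", "id_d"],
]

pairs = [
    pair
    for refs in reference_lists
    for pair in skipgram_pairs(refs, window=1)
]

for target, context in pairs:
    print(target, "->", context)
```

The model is trained to predict the context id from the target id, so articles that keep appearing near the same neighbours end up with similar vectors.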
Let's look at some of the results. I will start with one of my favourites.
I retrieve the three closest articles by cosine similarity to the following classic article:
- Regularization and Renormalization of Gauge Fields (Gerard ’t Hooft, M.J.G. Veltman)
I get the following results:
- A Method of Gauge Invariant Regularization (J.F. Ashmore)
- Dimensional Renormalization: The Number of Dimensions as a Regularizing Parameter (C.G. Bollini, J.J. Giambiagi)
- Lowest order divergent graphs in nu-dimensional space (C.G. Bollini, J.J. Giambiagi)
These results are very good, since these articles developed the concept of dimensional regularization at the same time as the article by Gerard ’t Hooft and M.J.G. Veltman.
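For readers curious about the retrieval step, ranking by cosine similarity can be sketched in a few lines of plain Python. The two-dimensional vectors and the ids below are toy values invented for illustration, not our trained embeddings, and `most_similar` here is a hypothetical helper rather than the gensim method of the same name.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(query_id, vectors, topn=3):
    """Return the `topn` ids closest to `query_id`, best match first."""
    query = vectors[query_id]
    scored = [
        (other_id, cosine(query, vec))
        for other_id, vec in vectors.items()
        if other_id != query_id  # skip the query itself
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:topn]


# Toy embeddings, invented for this example
vectors = {
    "id_a": [1.0, 0.0],
    "id_b": [0.9, 0.1],
    "id_c": [0.0, 1.0],
    "id_d": [0.7, 0.7],
}

neighbours = most_similar("id_a", vectors, topn=3)
for article_id, score in neighbours:
    print(article_id, round(score, 3))
```

Cosine similarity only compares directions, so two articles count as close when their vectors point the same way, regardless of vector length.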
Let's look at another article. Starting with
- Broken Symmetries and the Masses of Gauge Bosons (Peter W. Higgs)
we obtain the following three most similar articles:
- Broken symmetries, massless particles and gauge fields (Peter W. Higgs)
- Spontaneous Symmetry Breakdown without Massless Bosons (Peter W. Higgs)
- Broken Symmetry and the Mass of Gauge Vector Mesons (F. Englert, R. Brout)
This again looks quite good, considering the Nobel Prize in Physics awarded in connection with the discovery of the Higgs boson. We are very happy with the results obtained so far and continue working on the topic.