Software projects produce a wealth of data that is leveraged in different tasks and for different purposes: researchers collect project data to build experimental datasets; programmers reuse code from projects; developers look for opportunities to get involved in a project's development to gain or offer expertise. Finding relevant projects that suit one's needs is, however, currently challenging with the capabilities of existing search systems. We propose Orion, an integrated search engine architecture that combines information from different types of software repositories from multiple sources to facilitate the construction and execution of advanced search queries. Orion provides a declarative query language that gives users a uniform interface, transparently integrating different artifacts of project development and maintenance, such as source code information, version control system metadata, bug tracking system elements, and metadata on developer activities and interactions extracted from hosting platforms. We have built an extensible system with an initial capacity of over 100,000 projects collected from the web, featuring several types of software repositories and software development artifacts. We conducted an experiment with 10 search scenarios to compare Orion with traditional search engines, and to explore the need for our approach as well as the productivity of the proposed infrastructure. The results show with strong statistical significance that users find relevant projects faster and more accurately with Orion.
- Search engines
- Software engineering
Available at: http://works.bepress.com/david_lo/143/