Hougen, Dean
Rand, Jeremy
2018-05-11
2018-05-11
2018
https://hdl.handle.net/11244/299892

The YaCy decentralized web search engine offers significant potential advantages in censorship resistance over centralized search engines such as Google. However, YaCy currently suffers from deficiencies in the relevance of its results as well as weaknesses in privacy. We have developed improvements to YaCy's relevance, including tools to generate a ranking dataset that can be fed to machine learning algorithms, fixes for some significant YaCy flaws that severely damaged ranking, and tools for ensuring that the decentralized index contains relevant results. We have also conducted an initial privacy audit of YaCy's usage of anonymizing proxies and of YaCy's application-layer protocol, with recommendations for improving YaCy's privacy in both areas. We believe that this work helps pave the way for YaCy to become a credible competitor to centralized search engines. We expect future work to experiment with various machine learning implementations using our ranking dataset generation toolset, to implement the improvements recommended by our initial privacy audit, and to conduct more extensive privacy audits once our initial recommendations are implemented.

decentralization; search engines; machine learning; anonymity

Relevance and Privacy Improvements to the YaCy Decentralized Web Search Engine