
Custom Search Engine
By Mehakdeep Singh Chhina (202193857)

Supervisor:
Thumeera R. Wanasinghe

ENGI 981B -001


Introduction

● A powerful custom search engine built on Elasticsearch


● Gathers data using web crawlers
● Provides one place to analyze data collected from different sources
● Can act as a personal Google
Motivation

● Software to handle data scattered across different places on the web


● Multiple use cases: collecting data in one place, competitor analysis
● Previous work experience
Architecture

● User passes a search query to get results


● Elasticsearch acts as the repository
● Crawlers are responsible for populating Elasticsearch concurrently
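
As a rough illustration of how a crawler might hand its results to Elasticsearch, the sketch below builds the newline-delimited JSON body that the Elasticsearch `_bulk` API accepts. The index name `posts`, the document fields, and the helper name `buildBulkBody` are assumptions made for this example; the slides do not specify them.

```javascript
// Build a newline-delimited JSON body for Elasticsearch's _bulk API.
// Each document becomes two lines: an action line and a source line.
// The index name and post fields used here are illustrative assumptions.
function buildBulkBody(indexName, posts) {
  const lines = [];
  for (const post of posts) {
    const { id, ...source } = post; // keep the id out of the stored source
    lines.push(JSON.stringify({ index: { _index: indexName, _id: id } }));
    lines.push(JSON.stringify(source));
  }
  // The _bulk API requires the body to end with a newline.
  return lines.length > 0 ? lines.join("\n") + "\n" : "";
}

const exampleBody = buildBulkBody("posts", [
  { id: "1", company: "Acme", text: "First crawled post" },
  { id: "2", company: "Acme", text: "Second crawled post" },
]);
```

Each crawler could send a body like `exampleBody` in a POST request to the `_bulk` endpoint, so several crawlers can write to the same index independently.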
Architecture
Flow Chart
Tools

● React
● Node.js
● Material UI
● Elasticsearch
LinkedIn Crawler

Enter the URL and the name of the company to show in the filters.
LinkedIn Crawler

Wait for the crawler to finish crawling the posts, then check the results on the search page.
Search Client

● Search the data you have crawled


● Use different keywords and filters to find company-specific data
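
A keyword search with a company filter could be expressed in the Elasticsearch Query DSL roughly as follows. This is only a sketch: the field names `text`, `author`, and `company`, the page size, and the helper `buildSearchQuery` are assumptions, and `company` is assumed to be mapped as a `keyword` field so that an exact `term` filter works.

```javascript
// Build an Elasticsearch search request body from user input.
// Field names ("text", "author", "company") are hypothetical.
function buildSearchQuery(keywords, company) {
  const body = {
    size: 10,
    query: {
      bool: {
        // Full-text match on the keywords contributes to the relevance score.
        must: [{ multi_match: { query: keywords, fields: ["text", "author"] } }],
        // Filters narrow the result set without affecting the score.
        filter: [],
      },
    },
  };
  if (company) {
    body.query.bool.filter.push({ term: { company: company } });
  }
  return body;
}

const exampleQuery = buildSearchQuery("sustainability report", "Acme");
```

Placing the company under `filter` rather than `must` keeps ranking driven by the keywords alone, which matches the idea of filters as a way to narrow results.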
Search Client
Results

● A custom search product with multiple use cases for different organisations
● Example uses: ESG, data analysis, customer service
● Web crawlers and developer APIs are used to collect data from different platforms
● Node.js child processes are used so the software does not block while crawling
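
The child-process approach can be sketched as below. The inline child script is a hypothetical stand-in for a real crawler (it only echoes the URL and returns an empty list of posts), and `runCrawlerInChild` and the `CRAWL_URL` environment variable are names invented for this example.

```javascript
const { spawn } = require("node:child_process");

// Hypothetical stand-in for a crawler script: it reads the target URL from
// an environment variable and prints a JSON result with no posts.
const CHILD_SOURCE = `
  const url = process.env.CRAWL_URL;
  process.stdout.write(JSON.stringify({ url, posts: [] }));
`;

// Run the crawler in a separate Node.js process so the parent's event loop
// stays free to serve search requests while the crawl is in progress.
function runCrawlerInChild(url) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ["-e", CHILD_SOURCE], {
      env: { ...process.env, CRAWL_URL: url },
      stdio: ["ignore", "pipe", "inherit"],
    });
    let output = "";
    child.stdout.setEncoding("utf8");
    child.stdout.on("data", (chunk) => {
      output += chunk;
    });
    child.on("error", reject);
    child.on("close", (code) => {
      if (code !== 0) {
        reject(new Error(`crawler exited with code ${code}`));
        return;
      }
      try {
        resolve(JSON.parse(output));
      } catch (err) {
        reject(err);
      }
    });
  });
}
```

A caller would use it as `runCrawlerInChild("https://example.com/company").then((result) => console.log(result.posts.length))`; the parent process continues handling other work until the promise settles.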
Challenges

● Making the crawlers scroll the page and collect the required posts
● Setting up Elasticsearch
● Showing images and videos related to posts in search results
Missing

● Additional web crawlers, especially ones that use developer APIs and OAuth
● A downloadable search interface that users can add to their own websites
Future Work

● Machine learning algorithms can be added on top of Elasticsearch


● Generate tags and user-specific search results
● Search settings can be provided so users can improve result ranking, configure stop words, etc.
● Open APIs can be provided using OAuth so other developers can use the search services
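
One possible shape for the user-configurable stop words is an Elasticsearch index settings object with a custom analyzer. The sketch below assumes the stop words are supplied as an array of strings; the names `user_stop`, `user_text`, and `buildIndexSettings` are made up for this example.

```javascript
// Build Elasticsearch index settings with a custom analyzer that removes
// user-supplied stop words. Analyzer and filter names are hypothetical.
function buildIndexSettings(stopWords) {
  return {
    settings: {
      analysis: {
        filter: {
          // The built-in "stop" token filter accepts an explicit word list.
          user_stop: { type: "stop", stopwords: stopWords },
        },
        analyzer: {
          user_text: {
            type: "custom",
            tokenizer: "standard",
            filter: ["lowercase", "user_stop"],
          },
        },
      },
    },
  };
}

const exampleSettings = buildIndexSettings(["the", "a", "an"]);
```

Settings like these would be sent when the index is created, and text fields would then reference the `user_text` analyzer in their mappings.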
Thank You!