Abstract: Query processing is the means by which data is retrieved from a database reliably, and the performance of a database system depends on the query processing strategies it uses. A database must be able to answer its users' requests for data. Large database systems may run in unpredictable and unstable environments, which makes it difficult to produce efficient query plans from the information available at compile time alone; returning results in a timely manner is the task of query optimization. Efficient query processing is an essential requirement in many interactive environments that involve large amounts of data. This paper explains the effect of query processing and optimization on the distributed database (DDB), which requires the transmission of data between computers in a network. The arrangement of data transmissions and local data processing is known as a distribution strategy for a query. Two cost measures, response time and total time, are used to judge the quality of a distribution strategy. In addition, different algorithms are described that derive distribution strategies with minimal response time and minimal total time for a special class of queries, in order to assess the performance of the DDB.
Keywords: query processing, query optimization, distributed database.
In general, a database system should be able to reply to the requests of its users. Retrieving data from a database system is the concern of query processing, and returning the result within an acceptable time is the concern of query optimization. Query processing and query optimization are essential parts of an RDBMS: the result of a query should be returned to its user, which may be a person, a robotic assembly machine, or another DBMS, within the timeframe that the user specifies [5]. Query processing reflects the performance of the database, while query optimization reflects the response time of the database system.
Furthermore, a database query is a request sent to the RDBMS to modify or retrieve specific data. Updating and retrieving data are carried out through different low-level operations in the RDBMS, which may also be relational algebra operations such as project, join, select, and Cartesian product [5].
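The relational algebra operations named above can be illustrated with a minimal Python sketch. The relations `employees` and `departments` and the helper functions are hypothetical and introduced only for illustration; a real RDBMS implements these operators very differently.

```python
from itertools import product

# Hypothetical sample relations, represented as lists of dictionaries.
employees = [
    {"emp_id": 1, "name": "Ada", "dept_id": 10},
    {"emp_id": 2, "name": "Bo", "dept_id": 20},
    {"emp_id": 3, "name": "Cy", "dept_id": 10},
]
departments = [
    {"dept_id": 10, "dept_name": "Research"},
    {"dept_id": 20, "dept_name": "Sales"},
]

def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Projection: keep only the named attributes and drop duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

def cartesian_product(left, right):
    """Cartesian product: pair every left row with every right row."""
    return list(product(left, right))

def join(left, right, attribute):
    """Equi-join: the Cartesian product filtered on one shared attribute."""
    return [
        {**l, **r}
        for l, r in cartesian_product(left, right)
        if l[attribute] == r[attribute]
    ]

# Names of the employees who work in the Research department.
research = select(
    join(employees, departments, "dept_id"),
    lambda row: row["dept_name"] == "Research",
)
names = project(research, ["name"])
```

Here the join is expressed as a Cartesian product followed by a filter, which is the textbook definition rather than an efficient implementation.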
A Relational Database Management System (RDBMS) is a specific type of DBMS that uses the relational model: it lets users store data in multiple tables that are related to one another through common fields. It is also the most popular type of database system; examples include MS SQL Server, DB2, Oracle, and MySQL. A Database Management System (DBMS) stores data in a way that makes it easier to retrieve, manipulate, and produce information. It enables users to create and manage databases, allows the data to be accessed by multiple users in different locations, and lets users create, read, update, and delete data in the database. A DBMS can also control how an end user views data by granting users different permissions to access the data. Users of a DBMS can be classified into three types:
Query processing and query optimization are the most important components of an RDBMS: “these components are responsible for translating a user query, usually written in a non-procedural language like SQL – into an efficient query evaluation program that can be executed against the database.” (Saurabh Gupta, Gopal Singh Tandel, Umashankar Pandey, 2015) [8].
Moreover, query processing and optimization play an important role in the distributed database (DDB) with respect to database performance, which is measured by different algorithms. In a DDB, data is distributed across various sites and is accessed through query requests; query processing and optimization choose the best way to execute each query. In a distributed database, queries are affected by:
1. The method used to insert the data into the server.
2. The transmission time between servers.
The response time of a query depends on the transmission time between servers [1].
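The two cost measures mentioned in the abstract, total time and response time, can be contrasted with a small sketch. Everything in it is an assumption made for illustration: the linear transmission cost `C0 + C1 * size`, the constants, the example strategy, and the decision to ignore local processing cost. Total time adds up all transmissions, while response time follows the longest chain of dependent transmissions because independent transmissions may run in parallel.

```python
from functools import lru_cache

C0 = 10.0  # hypothetical fixed start-up cost per transmission
C1 = 1.0   # hypothetical cost per unit of data transmitted

def transmission_cost(size):
    """Assumed linear cost model for sending `size` units of data."""
    return C0 + C1 * size

# A hypothetical distribution strategy. Each transmission has a data size and
# a list of transmissions that must finish before it can start.
strategy = {
    "R1 to site 2": {"size": 100, "after": []},
    "R3 to site 2": {"size": 40, "after": []},
    "reduced R2 to result site": {
        "size": 20,
        "after": ["R1 to site 2", "R3 to site 2"],
    },
}

def total_time(strategy):
    """Total time: the sum of the costs of all transmissions."""
    return sum(transmission_cost(t["size"]) for t in strategy.values())

def response_time(strategy):
    """Response time: finish time of the last transmission when independent
    transmissions run in parallel (the longest dependency chain)."""
    @lru_cache(maxsize=None)
    def finish(name):
        t = strategy[name]
        start = max((finish(dep) for dep in t["after"]), default=0.0)
        return start + transmission_cost(t["size"])
    return max(finish(name) for name in strategy)

total = total_time(strategy)        # 110 + 50 + 30
response = response_time(strategy)  # max(110, 50) + 30
```

In this example the two initial transmissions overlap, so the response time is lower than the total time even though the same data is moved.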
This paper explains the effect of query processing and optimization on the distributed database (DDB), including response time and transmission cost, by describing several algorithms.
1.1 Distributed Database
A distributed database is a collection of databases that can be stored at various sites of a computer network: “A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network” (Swati Gupta, Kuntal Saroha, Bhawna, 2011). A distributed database management system (DDBMS) is software that manages the DDB and makes the distribution transparent to the users. Each database may involve a different DBMS and a different architecture that distributes the execution of procedures [10].
Distributed databases also play an important role today, when all sorts of users need to be connected to a company's database: in addition to the company's own employees, customers, potential customers, and vendors need access to the information in the databases [9, 10].
The idea of the DDB is to store data in different databases across the network instead of holding it in a single database, while keeping the data accessible to different users from different places [9]. People access the data by means of queries [2]. The processing of a distributed query is composed of the following phases [9]:
· Local processing phase.
· Reduction phase.
· Final processing phase.
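The three phases listed above can be sketched as follows. This is a simplified illustration under stated assumptions: the two sites, their data, and the function names are hypothetical, and the reduction phase is shown as a semijoin, which is a common technique for reducing relations before shipping them but is not necessarily the one used by the algorithms discussed in this paper.

```python
# Hypothetical fragments stored at two different sites.
site_a_customers = [
    {"cust_id": 1, "country": "SE"},
    {"cust_id": 2, "country": "DE"},
    {"cust_id": 3, "country": "SE"},
]
site_b_orders = [
    {"order_id": 100, "cust_id": 1, "total": 50},
    {"order_id": 101, "cust_id": 2, "total": 75},
    {"order_id": 102, "cust_id": 3, "total": 20},
    {"order_id": 103, "cust_id": 4, "total": 90},
]

def local_processing(customers, country):
    """Phase 1: apply the selection and projection locally at site A."""
    return [{"cust_id": c["cust_id"]} for c in customers if c["country"] == country]

def reduction(orders, customer_keys):
    """Phase 2: semijoin at site B, keeping only orders whose key was shipped."""
    return [o for o in orders if o["cust_id"] in customer_keys]

def final_processing(customers, orders):
    """Phase 3: join the reduced relations at the result site."""
    by_id = {c["cust_id"]: c for c in customers}
    return [{**by_id[o["cust_id"]], **o} for o in orders if o["cust_id"] in by_id]

local_customers = local_processing(site_a_customers, "SE")
shipped_keys = {c["cust_id"] for c in local_customers}   # sent from site A to site B
reduced_orders = reduction(site_b_orders, shipped_keys)  # only these rows leave site B
result = final_processing(local_customers, reduced_orders)
```

The point of the reduction phase is that only two of the four order rows have to be transmitted to the result site.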
A distributed database has several benefits, including [2]:
1- Improved performance.
3- Availability and reliability.
4- Reduced communication overhead.
5- Easier system expansion.
A distributed database management system (DDBMS) supports the creation and maintenance of distributed databases, in which data is kept at various sites connected through a network. One objective of a DDBMS is to present a simple and unified interface to the users so that they can access the databases as if there were a single database. Another important objective of a DDBMS is to process distributed queries efficiently, in addition to providing availability and reliability [3].
1.2 Query Processing
Query processing is the activity of converting a high-level query into a low-level language. Most queries submitted to the DBMS are written in a high-level language such as SQL. In the parsing and translation stage, the human-readable form is converted into the form used by the DBMS, which comprises a relational algebra expression, a query tree, and a query graph [5]. Query processing methods for multiple dimensions are divided into the five steps below [7].
1. Selection Query Model.
2. Data access model.
4. Query and Data uncertainty.
5. Ranking Function.
The transformation of a high-level query into a low-level query by query processing goes through several stages, as described below [5, 7]:
A. Parsing and Translation: In this step, a query is submitted to the DBMS in a high-level query language such as SQL, where it is represented as a string or sequence of characters, and is converted into a usable internal form [5, 7].
B. Optimization: In this step, the query processor applies rules to the internal data structure in order to transform it into an equivalent but more efficient representation [5, 7].
Figure 1 illustrates the steps of processing a high-level query.
Fig.1 Steps of processing a high-level query [6]
C. Evaluation: This is the last step of processing a query. In this step, the best candidate plan generated by the optimization engine is selected and then executed [5, 7].
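The stages A to C above can be observed from the outside with a small sketch that uses Python's built-in `sqlite3` module. SQLite is chosen here only as a convenient stand-in; the paper does not name it, and the table `emp`, the index `idx_emp_dept`, and the data are hypothetical. The SQL string is parsed when the statement is prepared, `EXPLAIN QUERY PLAN` exposes the plan selected by the optimizer, and executing the statement corresponds to evaluation.

```python
import sqlite3

# In-memory database with a hypothetical table and index.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    CREATE INDEX idx_emp_dept ON emp (dept);
    INSERT INTO emp (name, dept) VALUES
        ('Ada', 'Research'), ('Bo', 'Sales'), ('Cy', 'Research');
""")

# A. Parsing and translation: the query starts as a plain character string.
query = "SELECT name FROM emp WHERE dept = ? ORDER BY name"

# B. Optimization: ask the engine which plan it selected for this query.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("Research",)).fetchall()
plan_details = [row[-1] for row in plan]  # last column holds the description

# C. Evaluation: execute the selected plan and fetch the result.
rows = conn.execute(query, ("Research",)).fetchall()
conn.close()
```

The exact wording of the entries in `plan_details` depends on the SQLite version, so it is best read as diagnostic output rather than parsed.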
The figure below illustrates the steps of query processing in a database [8].
Fig.2 Query processing in a database [8].
Query processing is an important concern in the area of distributed databases. The task is to determine the sequence and the sites for executing the set of operations of a query such that the operating cost (communication cost and processing cost) of processing the query is minimized. Query processing depends not only on the operations of the query but also on the parameter values associated with the query. Distributed query processing has an important impact on the performance of a distributed database system [3].
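As a rough illustration of choosing an execution site by operating cost, the sketch below compares three candidate sites for a single operation that needs all relations in one place. The relation sizes, link costs, and processing costs are invented values, and the cost formula (data shipped times link cost, plus data processed times a per-site rate) is an assumption; it also ignores the ordering of operations, which a full strategy would have to consider.

```python
# Hypothetical inputs: where each relation is stored and how large it is.
relations = {
    "R1": {"site": "A", "size": 1000},
    "R2": {"site": "B", "size": 200},
    "R3": {"site": "C", "size": 50},
}
# Assumed cost per unit of data sent between two sites (symmetric links).
link_cost = {("A", "B"): 1.0, ("A", "C"): 2.0, ("B", "C"): 0.5}
# Assumed cost per unit of data processed at each site.
processing_cost_per_unit = {"A": 0.10, "B": 0.20, "C": 0.40}

def unit_link_cost(src, dst):
    """Cost of sending one unit of data from src to dst (zero if local)."""
    if src == dst:
        return 0.0
    return link_cost[tuple(sorted((src, dst)))]

def operating_cost(site):
    """Communication cost of shipping every relation to `site`,
    plus the cost of processing all of the data there."""
    communication = sum(
        r["size"] * unit_link_cost(r["site"], site) for r in relations.values()
    )
    processing = processing_cost_per_unit[site] * sum(
        r["size"] for r in relations.values()
    )
    return communication + processing

costs = {site: operating_cost(site) for site in processing_cost_per_unit}
best_site = min(costs, key=costs.get)
```

With these numbers the site that already holds the largest relation wins, because shipping that relation elsewhere dominates every other cost term.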
1.3 Query Optimization
Query optimization is responsible for returning the result of a query in a timely manner by choosing an efficient plan for its execution. It searches for a plan that reduces the overall execution cost of the query. Choosing the lowest-cost plan is known as cost-based optimization, and there are two other strategies for reducing the execution cost of a query [5]:
1. Heuristic-based optimization.
2. Semantic-based optimization.
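A well-known heuristic rule is to apply selections as early as possible so that later operations work on fewer rows. The sketch below compares two equivalent plans on hypothetical data; the nested-loop join and the use of the comparison count as a stand-in for execution cost are simplifying assumptions.

```python
# Hypothetical data: 20 orders spread over 5 customers.
orders = [{"order_id": i, "cust_id": i % 5, "total": 10 * i} for i in range(1, 21)]
customers = [
    {"cust_id": c, "country": "SE" if c % 2 == 0 else "DE"} for c in range(5)
]

def nested_loop_join(left, right, attribute):
    """Join two relations and count the row comparisons performed."""
    rows, comparisons = [], 0
    for l in left:
        for r in right:
            comparisons += 1
            if l[attribute] == r[attribute]:
                rows.append({**l, **r})
    return rows, comparisons

def plan_join_then_select():
    """Unoptimized plan: join everything, then filter by country."""
    joined, comparisons = nested_loop_join(orders, customers, "cust_id")
    result = [row for row in joined if row["country"] == "SE"]
    return result, comparisons

def plan_select_then_join():
    """Heuristic plan: push the selection below the join."""
    swedish = [c for c in customers if c["country"] == "SE"]
    return nested_loop_join(orders, swedish, "cust_id")

naive_rows, naive_comparisons = plan_join_then_select()
pushed_rows, pushed_comparisons = plan_select_then_join()
```

Both plans return the same rows, but the second one performs fewer comparisons because the join sees only the customers that pass the selection.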
Query optimization also rests on three principles [6]:
1. QEP Generation:
QEP generation defines the source language, the target language, and the transformations used to build a target-language expression from the original query. The target language reflects the run-time aspects that apply when the QEP is evaluated.
Example: “Physical representation of hash tables, an index which determines the usage of varies varieties of access operators. Operators implementing various join methods and index the QEP usage” (Dr.K. Kiran Kumar, T.M. Santhi Sri, Voruganti Vamshi priya, 2015).
2. Search strategy:
For a user-submitted query, several alternative QEPs are generated and explored in order to find a suitable candidate.
3. Cost Function:
The cost function is used to compare the various QEPs and to find the best one for producing an accurate result.
Example (Dr.K. Kiran Kumar, T.M. Santhi Sri, Voruganti Vamshi priya, 2015):