Secrets of optimization

 

Connecting state and local government leaders

How a few well-placed tweaks can greatly improve database performance.

In the world of databases, a few small tricks can add up to big performance gains, as the data warehouse managers at the Veterans Affairs Department discovered.

To better understand how the soon-to-be-released Microsoft SQL Server 2008 operates, the department had been running an advance copy of the database management software in a production environment for moving VA clinical transaction data into a data warehouse.

The management team found that one of the newer features in SQL Server 2008 could reduce the time queries. The feature, called partitioning, lets users divide a database table into smaller units. In VA's case, a large table of information could be divided into smaller partitions, one for each year. So when a user queries all the records for just last year, the results would be returned more quickly than if the database engine had to scan all the database records, said VA data warehouse manager Jack Bates.

It's a small tweak, but it has an impact. Whether you're running a single instance of MySQL to keep some records for a small department or overseeing a server farm humming with Oracle 11g databases, you probably could benefit from tweaking your database system.

It's a neglected art form. When managers complain that their applications are running too slow ' a complaint heard all too often ' the first place administrators look is the servers. However, 'usually server tuning is not the answer,' said Sheeri Cabral, a database administrator at remote database management provider Pythian Group.

Instead, she suggested, performance gains could be had by examining and improving how database queries are structured and how the database schema was designed. Small changes can lead to improvements in speed and accuracy.

'Generally, most performance issues come back to good design,' agreed New York-based database consultant Ronald Bradford.

Be normal ' or not

When getting a database in order, one of the first steps is normalization ' paring duplicated data down to a minimum.

At the O'Reilly Open Source conference last month in Portland, Ore., Mike Hillyer, a senior sales engineer at e-mail system provider Message Systems, explained the concept of normalization.

Especially in small offices, databases start as large spreadsheets. In database terms, they are supertables, in which all the information is captured, one line per entry, on a single table. Having all the data on one page is handy at first, but as the data grows, trying to extract and aggregate useful information from a supertable becomes onerous.

Although normalization does not improve performance per se, it can improve the quality of data and clean up the design of the database, which, in turn, can speed searches.

With normalization, each entry or row of data has a primary key ' a unique key that identifies that row from the others. More important, a normalized database has little duplicative data, or items that appear more than once.

For instance, in a database of books and their authors, an author might have written more than one book ' this is called vertical redundancy. Or a book could have more than one author, causing a duplication in columns ' called horizontal redundancy.

Such redundancies can cause trouble. If a name is entered more than once, it might be spelled incorrectly at one point, causing it to be missed by searches. When an address needs to be changed, it must be changed in multiple locations. Horizontal redundancies can balloon the size of the database.

Hillyer recounted the story of working for a political campaign that kept a database of about 150,000 voters and addresses. Each time they ran surveys, they added new columns for each question into the database. 'We were afraid we'd run out of columns in the table,' he said. Moreover, most space in the database went unused. A survey might have gone to 500 people, but the column for the answers was added to all 150,000 entries. 'So 149,500 rows had columns that would never be used,' he said.

Redundancy is minimized by creating additional tables. Instead of entering a single name or address multiple times, enter it once in one table, and refer to that table whenever the information is needed. Likewise, all the authors could be given a table of their own, and the entry for each book could simply have pointers to all of its authors.

The next stage of normalization involves arranging the tables so items in each entry make a composite key ' meaning there are no entries with duplicate data. 'Every piece of data in the table must relate to the composite primary key in order to be in that table,' Hillyer said. 'Otherwise, it should be somewhere else.'

The third form of normalization is trickier, but Hillyer said that if you complete the first two forms, the third should take care of itself. In this form, all fields in a table must rely on the primary key. For example, if you are capturing an address in a database table, you don't need the name of the state in that table. You only need the address and the ZIP code. States can be defined by ZIP codes, and the relationship between the two can be captured in another table.

On the other hand, too much normalization can slow database response, Cabral said.

She offered the example of a banking account. By strict rules of normalization, the balance of a checking account should not appear anywhere in a database. After all, it can be calculated by adding together all the entries into the account. Such a procedure, however, would require a lot of resources and time whenever a customer needed a balance. It is easier to add an entry to keep the running total of the account, even though it flaunts the design laws of normalization.

'There is trade-off,' she said. 'The more data integrity you have, the more complexity you have. Are you willing to have some discrepancies in your data in order to be superfast?'

'Think about what your application is doing, and then start off with a normalized table structure,' Cabral said. 'Then go through that structure and denormalize those parts that will be slow.'

Query analysis

At the annual Usenix conference in Boston, Cabral led a session on database tuning. One technique she insisted on for speeding responses was query tuning.

Getting any information from a relational database requires a Structured Query Language query, which is a format for how to question a database. At its most basic, it requires the name of the tables to be consulted, the criteria for pulling in all matching columns, perhaps some value found within those matching columns, and some instructions on how to group the results.

There are many ways to structure a query but if you want a speedy response, experts say, write a query that makes as little work for the database as possible.

With large systems, this approach is an art form. 'I like to take a holistic approach, ask 'What is this query doing?' ' Cabral said.

Say you want to find the names of everybody logged in to some sort of online system who live in a particular ZIP code. 'Do you start by finding everyone who has logged in at first and then find out whether or not they are near me? Or do you find everyone who is near me first and then check if they are logged in,' Cabral said. The idea is to start with the smallest pool of potential candidates and then narrow down from there. The smaller the initial draw of data, the less work for the database engine.

But keep in mind that you want to have the database do as much of the sorting as possible, so it doesn't have to be done by the application itself, said Tom Kyte, Oracle's vice president of the core technology group, who writes an advice column for database questions on the Oracle Web site.

This rarely gets done, he said. An application programmer ' not a database administrator ' builds the interface from the application to the database. As a result, programmers tend to have their applications do the sorting rather than the database, which is better suited for the task, though less-understood by the programmer.

Developers tend to write procedural code, that is to say routines that do one simple task iteratively. This process can take much longer than a well-crafted SQL statement, Kyte said.

As for writing queries, Kyte suggests that administrators pack as much information as possible in the SQL statement. 'The optimizer will have lots of information about what it is you are actually trying to do, and it can come up with optimal ways of doing it,' he said.

All databases now have query analyzers, or tools that examine a query and rework it to run more efficiently. Oracle offers a query analyzer as part of its Oracle Enterprise Manager Performance Diagnostic Pack. Other tools exist for administrators to get a better view of how their queries execute, such as the open-source Snowflake and MySQL Proxy.

Know thy database

A final rule of thumb is to know the idiosyncrasies of the database being used, Bradford said.

Although all relational databases implement SQL ' in least in varying degrees of fidelity ' each database has a different set of management tools and customs. Knowing how each works can help shave response times of the database.

'MySQL does things differently than Oracle,' Bradford said. For example, Oracle generally has one data type for all numbers. Bradford once managed a MySQL database that was originally designed by a Oracle administrator who used the generic number data type often used in Oracle databases. Simply by converting the data type to a more specific MySQL data type, Bradford was able to reduce the size of the 12G database by 85 percent.

'Know your specifics. If there is an online manual, you should be reading it from cover to cover,' Bradford said.

X
This website uses cookies to enhance user experience and to analyze performance and traffic on our website. We also share information about your use of our site with our social media, advertising and analytics partners. Learn More / Do Not Sell My Personal Information
Accept Cookies
X
Cookie Preferences Cookie List

Do Not Sell My Personal Information

When you visit our website, we store cookies on your browser to collect information. The information collected might relate to you, your preferences or your device, and is mostly used to make the site work as you expect it to and to provide a more personalized web experience. However, you can choose not to allow certain types of cookies, which may impact your experience of the site and the services we are able to offer. Click on the different category headings to find out more and change our default settings according to your preference. You cannot opt-out of our First Party Strictly Necessary Cookies as they are deployed in order to ensure the proper functioning of our website (such as prompting the cookie banner and remembering your settings, to log into your account, to redirect you when you log out, etc.). For more information about the First and Third Party Cookies used please follow this link.

Allow All Cookies

Manage Consent Preferences

Strictly Necessary Cookies - Always Active

We do not allow you to opt-out of our certain cookies, as they are necessary to ensure the proper functioning of our website (such as prompting our cookie banner and remembering your privacy choices) and/or to monitor site performance. These cookies are not used in a way that constitutes a “sale” of your data under the CCPA. You can set your browser to block or alert you about these cookies, but some parts of the site will not work as intended if you do so. You can usually find these settings in the Options or Preferences menu of your browser. Visit www.allaboutcookies.org to learn more.

Sale of Personal Data, Targeting & Social Media Cookies

Under the California Consumer Privacy Act, you have the right to opt-out of the sale of your personal information to third parties. These cookies collect information for analytics and to personalize your experience with targeted ads. You may exercise your right to opt out of the sale of personal information by using this toggle switch. If you opt out we will not be able to offer you personalised ads and will not hand over your personal information to any third parties. Additionally, you may contact our legal department for further clarification about your rights as a California consumer by using this Exercise My Rights link

If you have enabled privacy controls on your browser (such as a plugin), we have to take that as a valid request to opt-out. Therefore we would not be able to track your activity through the web. This may affect our ability to personalize ads according to your preferences.

Targeting cookies may be set through our site by our advertising partners. They may be used by those companies to build a profile of your interests and show you relevant adverts on other sites. They do not store directly personal information, but are based on uniquely identifying your browser and internet device. If you do not allow these cookies, you will experience less targeted advertising.

Social media cookies are set by a range of social media services that we have added to the site to enable you to share our content with your friends and networks. They are capable of tracking your browser across other sites and building up a profile of your interests. This may impact the content and messages you see on other websites you visit. If you do not allow these cookies you may not be able to use or see these sharing tools.

If you want to opt out of all of our lead reports and lists, please submit a privacy request at our Do Not Sell page.

Save Settings
Cookie Preferences Cookie List

Cookie List

A cookie is a small piece of data (text file) that a website – when visited by a user – asks your browser to store on your device in order to remember information about you, such as your language preference or login information. Those cookies are set by us and called first-party cookies. We also use third-party cookies – which are cookies from a domain different than the domain of the website you are visiting – for our advertising and marketing efforts. More specifically, we use cookies and other tracking technologies for the following purposes:

Strictly Necessary Cookies

We do not allow you to opt-out of our certain cookies, as they are necessary to ensure the proper functioning of our website (such as prompting our cookie banner and remembering your privacy choices) and/or to monitor site performance. These cookies are not used in a way that constitutes a “sale” of your data under the CCPA. You can set your browser to block or alert you about these cookies, but some parts of the site will not work as intended if you do so. You can usually find these settings in the Options or Preferences menu of your browser. Visit www.allaboutcookies.org to learn more.

Functional Cookies

We do not allow you to opt-out of our certain cookies, as they are necessary to ensure the proper functioning of our website (such as prompting our cookie banner and remembering your privacy choices) and/or to monitor site performance. These cookies are not used in a way that constitutes a “sale” of your data under the CCPA. You can set your browser to block or alert you about these cookies, but some parts of the site will not work as intended if you do so. You can usually find these settings in the Options or Preferences menu of your browser. Visit www.allaboutcookies.org to learn more.

Performance Cookies

We do not allow you to opt-out of our certain cookies, as they are necessary to ensure the proper functioning of our website (such as prompting our cookie banner and remembering your privacy choices) and/or to monitor site performance. These cookies are not used in a way that constitutes a “sale” of your data under the CCPA. You can set your browser to block or alert you about these cookies, but some parts of the site will not work as intended if you do so. You can usually find these settings in the Options or Preferences menu of your browser. Visit www.allaboutcookies.org to learn more.

Sale of Personal Data

We also use cookies to personalize your experience on our websites, including by determining the most relevant content and advertisements to show you, and to monitor site traffic and performance, so that we may improve our websites and your experience. You may opt out of our use of such cookies (and the associated “sale” of your Personal Information) by using this toggle switch. You will still see some advertising, regardless of your selection. Because we do not track you across different devices, browsers and GEMG properties, your selection will take effect only on this browser, this device and this website.

Social Media Cookies

We also use cookies to personalize your experience on our websites, including by determining the most relevant content and advertisements to show you, and to monitor site traffic and performance, so that we may improve our websites and your experience. You may opt out of our use of such cookies (and the associated “sale” of your Personal Information) by using this toggle switch. You will still see some advertising, regardless of your selection. Because we do not track you across different devices, browsers and GEMG properties, your selection will take effect only on this browser, this device and this website.

Targeting Cookies

We also use cookies to personalize your experience on our websites, including by determining the most relevant content and advertisements to show you, and to monitor site traffic and performance, so that we may improve our websites and your experience. You may opt out of our use of such cookies (and the associated “sale” of your Personal Information) by using this toggle switch. You will still see some advertising, regardless of your selection. Because we do not track you across different devices, browsers and GEMG properties, your selection will take effect only on this browser, this device and this website.