It happens again and again that (Oracle) database applications are “too slow”. We then try to find the cause and, if we are lucky, come up with a change to the offending SQL statement that makes the application “fast enough”. What is often overlooked in this process is that poor performance is not caused by the SQL statement alone; its origins frequently lie much earlier.
The planning of a high-performance database application begins with the planning of the database. In order to plan a database, the requirements of the applications that are to be built on it should be known. Some of the questions relevant for planning:
- Is it an ERP application or a data warehouse?
- Are there tables with very different record lengths?
- Does the application mainly query or process data?
- Are there several applications that are to work with the data, and do these applications have different access patterns?
The collected requirements can then be used to plan the physical layout of the database.
Tablespaces and block sizes
The block size(s) of the tablespaces should be adapted to the requirements of the application. As a reminder: Oracle only reads entire blocks and not individual data records. If there are major differences in the requirements for the individual tables, the tables should be distributed across tablespaces with suitable different block sizes.
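Which block sizes are currently in use can be read from the data dictionary, for example (dba_tablespaces requires DBA privileges; user_tablespaces works without them):
select tablespace_name, block_size from dba_tablespaces order by block_size;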
Small block sizes (2KB/4KB) are suitable for short record lengths that are often accessed individually. They are not suitable for long record lengths and when large quantities of data records are accessed frequently.
Large block sizes (8KB – 32KB) are suitable for long record lengths and when large amounts of data are queried sequentially.
If an entire large block has to be read for a single small data record, or if a long data record has to be spread over several blocks (chaining), the performance of the application suffers later on.
It may be worth creating tablespaces with different block sizes.
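A minimal sketch of what this could look like; the buffer cache for the non-default block size must be configured first, and the tablespace name, file name and sizes are only placeholders:
alter system set db_16k_cache_size = 64M;
create tablespace ts_16k datafile 'ts_16k_01.dbf' size 1G blocksize 16K;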
Tables
The data records of a table are stored in the blocks of its tablespace. As soon as a block of a tablespace has been assigned to a table, it belongs to that table, even if all data records in this block have since been deleted (high water mark). During a full table scan these now empty blocks are read as well and cost time. Oracle offers several options for releasing these blocks (export/import, alter table .. shrink space, etc.).
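A sketch of the shrink variant (the table name is only an example; shrink space requires an ASSM tablespace and row movement must be enabled first):
alter table orders enable row movement;
alter table orders shrink space;  -- compacts the data and lowers the high water mark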
In a freelist, Oracle keeps track of all empty or partially empty blocks of a table into which data records can still be written.
The storage parameters pctfree and pctused (percentage values) control when a block is added to or removed from the freelist. pctfree reserves a share of each block for the growth of its data records: as soon as inserts have filled a block so far that less than pctfree of it remains free, the block is taken off the freelist. Only when deletions reduce the used space to below pctused is the block put back on the freelist.
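Both parameters are set per table, for example (table name and values are only illustrative):
create table orders ( id number, txt varchar2(1000) ) pctfree 20 pctused 40;
alter table orders pctfree 10;
Note that in tablespaces with automatic segment space management (ASSM) only pctfree is relevant; pctused is ignored there.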
Chaining
If a data record does not fit into one block, it is distributed over several blocks. In order to read this one data record, all these blocks must be read one after the other. (Whether such data records exist in a table can be checked, for example, in user_tables in the column chain_cnt)
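chain_cnt is only filled by the old analyze command, not by dbms_stats. A quick check could therefore look like this (the table name is only an example):
analyze table orders compute statistics;
select table_name, num_rows, chain_cnt from user_tables where table_name = 'ORDERS';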
Chaining occurs when large data records are written into (too) small blocks, or when existing data records grow during an update (e.g. when a varchar2 value becomes longer). If the data record is longer than the block size, chaining cannot be prevented. In this case, you can consider moving the table to a tablespace with a larger block size.
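Such a move could look like this; table, tablespace and index names are placeholders, and all indexes of the table become unusable by the move and must be rebuilt afterwards:
alter table orders move tablespace ts_16k;
alter index orders_pk rebuild;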
If the growth of data records is foreseeable from the application requirements, a suitable value for pctfree of the table ensures that enough free space is left in each block for a grown data record to still fit into its existing block. For tables whose data records are written once and never changed afterwards, pctfree can be set to a low value.
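For such a write-once table this could look as follows (the table name is hypothetical; the new value only affects blocks filled from now on):
alter table audit_log pctfree 0;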
If data records are regularly deleted and rewritten, pctfree should be selected so that a new data record fits into the free space of a block that is already partially occupied:
Example
Block size: 4096 bytes
Average record length (user_tables.avg_row_len): 400 bytes
Two remarks:
- This average length can change.
- Many data records are certainly longer than 400 bytes.
With “select sum(c.DATA_LENGTH) from user_tab_columns c where c.TABLE_NAME = $NAME$” we can determine the maximum possible length of a data record.
Example a) the maximum data record length would be 600 bytes.
With pctfree = 15 (4096*0.15 = 614) we are on the safe side.
Example b) the maximum data record length would be 6000 bytes.
Since we do not know the distribution of the data record lengths, we have to make an assumption:
pctfree = 20 (4096*0.20 = 819) should be sufficient for most data records.
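Such a rough estimate can also be generated directly from the dictionary, following the logic of example a) (the table name is a placeholder; avg_row_len is only filled once statistics have been gathered):
select t.table_name, t.avg_row_len, ts.block_size,
       ceil( 100 * t.avg_row_len / ts.block_size ) as pctfree_estimate
from user_tables t
join user_tablespaces ts on ts.tablespace_name = t.tablespace_name
where t.table_name = 'ORDERS';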
We delve a little deeper. If we want to work with more reliable values, we can determine better key figures with a little effort:
dump() returns a string that, among other things, contains the length in bytes of the expression:
E.g.: select dump( 'ä' ) from dual;
DUMP('Ä')
Typ=96 Len=2: 195,164
We use this to determine the length of the individual columns and add them up.
Two notes:
- Each column requires an additional byte of memory. (We take this into account below)
- NULL columns (even several) at the end of a data record do not require any storage space. (We do not take this into account below!)
With
select '+ coalesce( to_number( regexp_substr( dump( ' || rpad( column_name, 32, ' ' ) || q'{ ), 'Len=(\d*):', 1, 1, 'x', 1 ) ), 0) +1}' as term from user_tab_columns where table_name = '$tablename$' order by column_id;
we can generate the repetitive, writing-intensive part of the following statement:
with len_ as (
  select coalesce( to_number( regexp_substr( dump( column_1 ), 'Len=(\d*):', 1, 1, 'x', 1 ) ), 0) +1
       + coalesce( to_number( regexp_substr( dump( column_2 ), 'Len=(\d*):', 1, 1, 'x', 1 ) ), 0) +1
       + ...
       + coalesce( to_number( regexp_substr( dump( column_n ), 'Len=(\d*):', 1, 1, 'x', 1 ) ), 0) +1 as bytes
  from $tablename$
)
select min(bytes), avg(bytes), max(bytes), stddev(bytes) from len_;
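For a hypothetical table t with the three columns a, b and c, the assembled statement would then look like this:
with len_ as (
  select coalesce( to_number( regexp_substr( dump( a ), 'Len=(\d*):', 1, 1, 'x', 1 ) ), 0) +1
       + coalesce( to_number( regexp_substr( dump( b ), 'Len=(\d*):', 1, 1, 'x', 1 ) ), 0) +1
       + coalesce( to_number( regexp_substr( dump( c ), 'Len=(\d*):', 1, 1, 'x', 1 ) ), 0) +1 as bytes
  from t
)
select min(bytes), avg(bytes), max(bytes), stddev(bytes) from len_;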
Note: this evaluation can take a long time for very large tables. In this case, it should be limited to a sufficiently large representative set ( where rownum < X ).
The values are not 100% reliable because of the second note above. However, they are sufficient for estimating the distribution of the data record lengths. Assuming that the lengths are normally distributed, a minimum value for pctfree can be determined as follows:
pctfree = 100 * ( avg + stddev ) / block size (sufficient for ~84% of all data records)
or pctfree = 100 * ( avg + 2*stddev ) / block size (sufficient for ~98% of all data records)
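A worked example with assumed values (avg = 600 bytes, stddev = 150 bytes, block size = 4096 bytes): 100 * ( 600 + 150 ) / 4096 ≈ 18.3, so pctfree = 19 covers roughly 84% of all data records; 100 * ( 600 + 2*150 ) / 4096 ≈ 22 covers roughly 98%.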
See also: