
    UKOUG Engineered Systems Summit 2013

    The UKOUG Engineered Systems Summit 2013 was held on 16th April 2013 at the CBI Conference Centre in London.

    There were three streams: Exadata, Engineered Systems and Business. I mainly attended the Exadata sessions. There was plenty of interesting content and this post only highlights a few points that I thought would be of general interest.

    So what did I learn?

    Licensing

    It was interesting to hear some of the licensing variations that customers have been able to negotiate including:

    • Licensing all the storage cells, but only licensing some of the compute nodes where demand for space is high
    • Licensing all the compute nodes, but only licensing some of the storage cells where demand for CPU/memory is high
    • Not licensing RAC; running all databases as single-instance

    I would guess that Oracle sales would frown on some of the above combinations. However, a more flexible licensing model is likely to make Exadata viable for a wider set of customers, so perhaps this would be good for Oracle as well.

    Reducing the number of compute nodes not only reduces the number of Enterprise Edition and RAC licenses required but also, of course, other options commonly used with Exadata such as Partitioning.

    USE_LARGE_PAGES Parameter

    Jason Arneil (E-DBA) recommended configuring huge pages on Exadata. This means that Automatic Memory Management (MEMORY_TARGET) cannot be used.

    He also suggested setting the USE_LARGE_PAGES parameter to ONLY which prevents an instance from starting if it cannot acquire sufficient huge pages. I know this has been a problem for a few sites in the past. Huge page problems often occur following changes to memory parameters or upgrades that increase the amount of memory required by an instance. In Oracle 11.2.0.3 valid values are TRUE (default), FALSE, AUTO and ONLY.
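
    As a minimal sketch, the parameter can be set in the spfile as shown below (the SGA size is purely illustrative; AMM has to be replaced by explicit SGA/PGA sizing when huge pages are in use):

    -- Refuse to start the instance unless the entire SGA can be allocated from huge pages
    ALTER SYSTEM SET use_large_pages='ONLY' SCOPE=SPFILE SID='*';
    -- Huge pages are incompatible with AMM, so disable MEMORY_TARGET and size the SGA explicitly
    ALTER SYSTEM SET memory_target=0 SCOPE=SPFILE SID='*';
    ALTER SYSTEM SET sga_target=16G SCOPE=SPFILE SID='*';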

    RMAN ACTIVE DUPLICATE

    Andy Colvin (Enkitec) discussed an Exadata upgrade/migration using Data Guard. The target Exadata system was in a new data centre and there were a lot of initial issues with the network.

    He used RMAN ACTIVE DUPLICATE to create the initial standby copy of the database. This was failing because of firewall timeouts. Apparently this operation has some sort of co-ordinator process that must establish a connection across the network for the duration of the duplicate operation – if the network times out this process will die.
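
    This was not the exact command from the presentation, but an active duplicate for a standby is typically run along the lines of the sketch below (connect strings and DB_UNIQUE_NAME are illustrative, and the auxiliary instance must already be started NOMOUNT):

    $ rman target sys@prod auxiliary sys@prodsby

    RMAN> DUPLICATE TARGET DATABASE
            FOR STANDBY
            FROM ACTIVE DATABASE
            SPFILE
              SET db_unique_name='prodsby'
            NOFILENAMECHECK;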

    I think this is an interesting issue. Most sites test creation/recreation of a standby database against a small test database or a newly migrated database. Data volumes may grow significantly once the database is in production. If there is ever a need to rebuild a standby following a genuine disaster, the increased volume may result in a timeout error. This is worth testing and/or building into procedures if necessary.

    dstat utility

    Andy Colvin also discussed the dstat utility. This is a Linux utility that can be downloaded and installed on Exadata and non-Exadata systems. According to its man page it is a replacement for vmstat, iostat, netstat and ifstat.

    A typical command line for Exadata might be:

    dstat -dnyc -N eth0,bondeth0,bondib0 -C total -f
    

    Note that I may have copied down the above command incorrectly, but the command is obviously worth investigating if it is more versatile than its predecessors.

    I found a copy of the dstat RPM in my OEL5U6 Linux ISO. It is a standalone RPM with no apparent dependencies. The help message is as follows:

    [oracle@vm4 bin]$ dstat -h
    Usage: dstat [-afv] [options..] [delay [count]]
    Versatile tool for generating system resource statistics
    
    Dstat options:
      -c, --cpu              enable cpu stats
         -C 0,3,total           include cpu0, cpu3 and total
      -d, --disk             enable disk stats
         -D total,hda           include hda and total
      -g, --page             enable page stats
      -i, --int              enable interrupt stats
         -I 5,eth2              include int5 and interrupt used by eth2
      -l, --load             enable load stats
      -m, --mem              enable memory stats
      -n, --net              enable network stats
         -N eth1,total          include eth1 and total
      -p, --proc             enable process stats
      -s, --swap             enable swap stats
         -S swap1,total         include swap1 and total
      -t, --time             enable time/date output
      -T, --epoch            enable time counter (seconds since epoch)
      -y, --sys              enable system stats
      --ipc                  enable ipc stats
      --lock                 enable lock stats
      --raw                  enable raw stats
      --tcp                  enable tcp stats
      --udp                  enable udp stats
      --unix                 enable unix stats
    
      -M stat1,stat2         enable external stats
         --mods stat1,stat2
    
      -a, --all              equals -cdngy (default)
      -f, --full             expand -C, -D, -I, -N and -S discovery lists
      -v, --vmstat           equals -pmgdsc -D total
    
      --integer              show integer values
      --nocolor              disable colors (implies --noupdate)
      --noheaders            disable repetitive headers
      --noupdate             disable intermediate updates
      --output file          write CSV output to file
    
      delay is the delay in seconds between each update
      count is the number of updates to display before exiting
      The default delay is 1 and count is unspecified (unlimited)
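
    As a usage sketch based only on the options listed above, the following would enable the time, load, memory, network and swap statistics, sample every 5 seconds for 60 samples, and also log the figures to a CSV file:

    dstat -tlmns --noheaders --output dstat.csv 5 60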
    

    Exadata Hybrid Columnar Compression

    Husnu Sensoy discussed Exadata Hybrid Columnar Compression. A few things I didn’t know included:

    Each of the four compression options uses a different standard compression algorithm, including LZO, ZLIB and BZIP2. These algorithms increase in complexity and, consequently, in the amount of CPU time required to compress and presumably to decompress.

    Internally EHCC column values are stored as deltas. For example for numbers:

    External Value   Internal Value
    100001           100001
    100002           1
    100003           2
    100004           3
    100005           4
    100006           5

    For dates:

    External Value   Internal Value
    01Jun2013        01Jun2013
    01Jun2013        0
    01Jun2013        0
    02Jun2013        1
    02Jun2013        1
    03Jun2013        2

    EHCC tables are stored in compression units. Each compression unit occupies a number of contiguous blocks.

    Compressed rows cannot be updated – the old row is deleted and the new row is inserted. For deletions a flag is set to indicate that the row has been deleted. The compression unit is not released until all of the rows have been marked as deleted.

    A turbo scan runs on the storage cell. It reads the entire compression unit into memory, where the required columns are decompressed, decoded, filtered and returned to the compute nodes.

    If all values do not fit into a single compression unit, then rows are chained together between compression units.

    A storage index can be created for compressed tables because the compression unit is always within a 1MB boundary.

    The DBMS_COMPRESSION package contains a procedure called GET_COMPRESSION_RATIO. This procedure can be used on both Exadata and non-Exadata systems. It calculates the compression ratio by creating both compressed and uncompressed versions of the table and then comparing them. So, interestingly, it is still possible for Oracle to create EHCC tables in non-Exadata systems; the functionality has been disabled, not removed.
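
    A minimal sketch of calling it from SQL*Plus might look like the following (the scratch tablespace, schema, table and compression type are purely illustrative):

    SET SERVEROUTPUT ON
    DECLARE
      l_blkcnt_cmp    PLS_INTEGER;
      l_blkcnt_uncmp  PLS_INTEGER;
      l_row_cmp       PLS_INTEGER;
      l_row_uncmp     PLS_INTEGER;
      l_cmp_ratio     NUMBER;
      l_comptype_str  VARCHAR2(100);
    BEGIN
      DBMS_COMPRESSION.GET_COMPRESSION_RATIO(
        scratchtbsname => 'USERS',                               -- scratch tablespace for the sample copies
        ownname        => 'SH',
        tabname        => 'SALES',
        partname       => NULL,
        comptype       => DBMS_COMPRESSION.COMP_FOR_QUERY_HIGH,  -- EHCC QUERY HIGH
        blkcnt_cmp     => l_blkcnt_cmp,
        blkcnt_uncmp   => l_blkcnt_uncmp,
        row_cmp        => l_row_cmp,
        row_uncmp      => l_row_uncmp,
        cmp_ratio      => l_cmp_ratio,
        comptype_str   => l_comptype_str);
      DBMS_OUTPUT.PUT_LINE(l_comptype_str || ' estimated compression ratio: ' || l_cmp_ratio);
    END;
    /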

    V$IOSTAT_FUNCTION_DETAIL

    Tanel Poder (Enkitec) highlighted the V$IOSTAT_FUNCTION_DETAIL dynamic performance view. It provides a useful summary of the amount of I/O performed by different types of file.

    Columns are:

    Column Name             Data Type
    FUNCTION_ID             NUMBER
    FUNCTION_NAME           VARCHAR2(28)
    FILETYPE_ID             NUMBER
    FILETYPE_NAME           VARCHAR2(28)
    SMALL_READ_MEGABYTES    NUMBER
    SMALL_WRITE_MEGABYTES   NUMBER
    LARGE_READ_MEGABYTES    NUMBER
    LARGE_WRITE_MEGABYTES   NUMBER
    SMALL_READ_REQS         NUMBER
    SMALL_WRITE_REQS        NUMBER
    LARGE_READ_REQS         NUMBER
    LARGE_WRITE_REQS        NUMBER
    NUMBER_OF_WAITS         NUMBER
    WAIT_TIME               NUMBER

    On my single instance database there are 17 rows in this view:

    Function Name        FileType Name
    DBWR                 Control File
    DBWR                 Data File
    DBWR                 Other
    LGWR                 Control File
    LGWR                 Log File
    LGWR                 Other
    Streams AQ           Data File
    Buffer Cache Reads   Data File
    Buffer Cache Reads   Other
    Direct Reads         Control File
    Direct Reads         Data File
    Others               Log File
    Others               Archive Log
    Others               Other
    Others               Other

    I do not know why there are Others/Other rows.
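
    A simple query along the following lines (a sketch) shows which database functions are driving the most I/O:

    SELECT function_name,
           filetype_name,
           small_read_megabytes + large_read_megabytes   AS read_mb,
           small_write_megabytes + large_write_megabytes AS write_mb,
           number_of_waits,
           wait_time
    FROM   v$iostat_function_detail
    ORDER  BY small_read_megabytes + large_read_megabytes
           + small_write_megabytes + large_write_megabytes DESC;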

    Buffer busy waits and gc buffer busy acquire waits

    Tanel Poder suggested that if wait times for buffer busy waits and gc buffer busy acquire waits are high, the wait parameters can be used to determine the file number, block number and class. A datafile block dump then shows the type of the block. If it is a bitmap block (KTFB Bitmapped File Space Bitmap), consider using larger extents. Use UNIFORM extent allocation so that all extents are the same size; do not use AUTOALLOCATE, as this option initially creates small extents and the bitmap will be based on the smallest extent size.
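
    As a sketch, the wait parameters can be picked up from V$SESSION while the waits are occurring, and the block then dumped to a trace file (the file and block numbers below are illustrative):

    -- For buffer busy waits: P1 = file#, P2 = block#, P3 = block class#
    SELECT sid, event, p1 AS file#, p2 AS block#, p3 AS class#
    FROM   v$session
    WHERE  event IN ('buffer busy waits', 'gc buffer busy acquire');

    -- Dump the block to a trace file to confirm its type (e.g. KTFB Bitmapped File Space Bitmap)
    ALTER SYSTEM DUMP DATAFILE 5 BLOCK 131;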

    Smart Scans

    Tanel Poder also discussed some of the reasons why Exadata may not use a smart scan when performing a full table scan. This will typically show up in the wait events, which might show a high amount of time in multiblock physical read compared to the amount of time in cell smart table scan. The latter would be the preferred wait event on Exadata.

    In order to perform a smart scan, Oracle must first be able to perform a direct path read. Direct path reads depend on two parameters:

    • _small_table_threshold
    • _serial_direct_read

    The default value for the _serial_direct_read parameter changed in Oracle 11.2.0.2. On Exadata it can be set to ALWAYS to force direct reads. On non-Exadata systems the new default value of AUTO may cause some performance issues and it may be necessary to disable serial direct reads for some applications.
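
    As a quick test, serial direct reads can be forced at session level (a sketch; underscore parameters should normally only be changed under Oracle Support's guidance):

    -- Force direct path reads for serial full scans in this session only
    ALTER SESSION SET "_serial_direct_read" = ALWAYS;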

    Whether or not to use a direct path read is determined at run time and, for example, may vary between partitions during a scan of the same table.
    Parallel full table scans may not use direct path reads either. If parallel_degree_policy is AUTO, in-memory parallel execution may cache the blocks in the buffer cache instead, and it may be necessary to set _parallel_cluster_cache_policy to ADAPTIVE to restore direct path reads.

    Star transformations may also disable smart scans. This is because Oracle uses bitmap indexes to select rows from the fact table. If a row is selected using a bitmap index it cannot be accessed by a smart scan.

    GoldenGate

    Matthew Walden (E-DBA) provided a short summary of the use of GoldenGate for migrations. He highlighted MOS Note 1296168.1, which includes a script that can be used to check that objects in a schema are compatible with GoldenGate.

    For migrations he uses archive log only mode. This reduces the impact on the online redo logs. This mode is configured in the extract parameter file using:

    TRANLOGOPTIONS ARCHIVELOGONLY
    

    While I have been researching GoldenGate, I have been wondering why the checkpoint table is stored in the target database as opposed to being stored in the file system. A possible reason is flashback – if the database is flashed back then the checkpoint table will also be flashed back. GoldenGate replication will resume at the restore point (or point to which the database was flashed back). If the checkpoint was stored outside the database, then this file would also need to be restored / modified following the flashback, otherwise GoldenGate might miss some redo or abend because of missing dependencies.

    End of post
