Sunday, March 22, 2009

Generating Histograms

Today I have been trying to generate a histogram for a table column as part of a larger statistics project. In fact it is quite a straightforward process although some of the syntax is a bit tricky.

The table I am working with is called GP.CAR and contains a list of all cars that have raced in Formula 1 since 1961. I am trying to build a frequency histogram based on the DRIVER_KEY column which is a CHAR(4). The histogram will contain the following data:













MSCH91
APRO51
ASEN41
NMAN31
JSTE27
NLAU25
JCLA25
NPIQ23
FALO22
DHIL22
MHAK20

The histogram can be built using the following code:

DECLARE
l_statrec dbms_stats.statrec;
l_charvals dbms_stats.chararray;
l_bkvals dbms_stats.numarray;
BEGIN
NULL;
l_charvals := dbms_stats.chararray ();
l_charvals.extend (11);

l_bkvals := dbms_stats.numarray ();
l_bkvals.extend (11);

l_charvals(1) := 'MSCH'; l_bkvals(1) := 91;
l_charvals(2) := 'APRO'; l_bkvals(2) := 51;
l_charvals(3) := 'ASEN'; l_bkvals(3) := 41;
l_charvals(4) := 'NMAN'; l_bkvals(4) := 31;
l_charvals(5) := 'JSTE'; l_bkvals(5) := 27;
l_charvals(6) := 'NLAU'; l_bkvals(6) := 25;
l_charvals(7) := 'JCLA'; l_bkvals(7) := 25;
l_charvals(8) := 'NPIQ'; l_bkvals(8) := 23;
l_charvals(9) := 'FALO'; l_bkvals(9) := 22;
l_charvals(10) := 'DHIL'; l_bkvals(10) := 22;
l_charvals(11) := 'MHAK'; l_bkvals(11) := 20;

l_statrec.epc := 11;
l_statrec.bkvals := l_bkvals;
l_statrec.eavs := 0;

DBMS_STATS.PREPARE_COLUMN_VALUES (l_statrec,l_charvals);

DBMS_STATS.SET_COLUMN_STATS
(
ownname => 'GP',
tabname => 'CAR',
colname => 'DRIVER_KEY',
distcnt => 11,
density => 0.00210084,
nullcnt => 0,
srec => l_statrec,
avgclen => 4
);
END;
/

I have not yet worked out how to calculate the density. In the above example I used the value in DBA_TAB_COLUMNS generated by a previous GATHER_TABLE_STATS operation.

In order to test that the histogram had been correctly generated, I used the following code:

DECLARE
l_column_name VARCHAR2(30);
l_endpoint_number NUMBER;
l_endpoint_value NUMBER;
l_cardinality NUMBER;

CURSOR c1 IS
SELECT column_name
FROM dba_tab_columns
WHERE owner = 'GP'
AND table_name = 'CAR'
AND histogram = 'FREQUENCY'
ORDER BY column_id;

CURSOR c2 (p_column_name VARCHAR2) IS
SELECT endpoint_value,endpoint_number,
endpoint_number -
NVL (LAG (endpoint_number,1) OVER (ORDER BY endpoint_number),0)
FROM dba_histograms
WHERE owner = 'GP'
AND table_name = 'CAR'
AND column_name = p_column_name
ORDER BY endpoint_number;
BEGIN
OPEN c1;
LOOP
FETCH c1 INTO l_column_name;
EXIT WHEN c1%NOTFOUND;
DBMS_OUTPUT.PUT_LINE ('Column Name: '||l_column_name);
DBMS_OUTPUT.NEW_LINE;

DBMS_OUTPUT.PUT ('Endpoint Value ');
DBMS_OUTPUT.PUT (' ');
DBMS_OUTPUT.PUT ('Endpoint Number');
DBMS_OUTPUT.PUT (' ');
DBMS_OUTPUT.PUT ('Cardinality');
DBMS_OUTPUT.NEW_LINE;

OPEN c2 (l_column_name);
LOOP
FETCH c2 INTO l_endpoint_value,l_endpoint_number,l_cardinality;
EXIT WHEN c2%NOTFOUND;
DBMS_OUTPUT.PUT (LPAD (TO_CHAR(l_endpoint_value),32,' '));
DBMS_OUTPUT.PUT (' ');
DBMS_OUTPUT.PUT (LPAD (TO_CHAR(l_endpoint_number),12,' '));
DBMS_OUTPUT.PUT (' ');
DBMS_OUTPUT.PUT (LPAD (TO_CHAR(l_cardinality),12,' '));
DBMS_OUTPUT.NEW_LINE;
END LOOP;
CLOSE c2;
DBMS_OUTPUT.NEW_LINE;
END LOOP;
CLOSE c1;
END;
/

The values in the ENDPOINT_NUMBER column are cumulative. Therefore I have used a LAG analytic function to calculate the cardinality of each value. The above script generates output similar to the following:

Column Name: DRIVER_KEY

Endpoint Value Endpoint Number Cardinality
-------------- --------------- -----------
40149562867409400000000000000000 91 91
33912840973169800000000000000000 142 51
33918822668491100000000000000000 183 41
40656607447559200000000000000000 214 31
38592008404879700000000000000000 241 27
40654579423238300000000000000000 266 25
38559493043189900000000000000000 291 25
40662755645815800000000000000000 314 23
36478518251134200000000000000000 336 22
35454232704855500000000000000000 358 22
40127236464058400000000000000000 378 20

DBMS_OUTPUT.NEW_LINE

This week I have been writing a lengthy SQL*Plus script to extract details about object statistics from the data dictionary. Some of the join statements have not been performing well, but I did not have time to investigate them so I rewrote them using PL/SQL cursors and outputting the results using DBMS_OUTPUT.

The problem I have always had with DBMS_OUTPUT is outputting blank lines. For example:

SET SERVEROUTPUT ON
BEGIN
dbms_output.put_line ('Line One');
dbms_output.new_line;
dbms_output.put_line ('Line Two');
END;
/

returns the following output

Line One
Line Two

I have also tried using PUT_LINE with a space character. For example:

SET SERVEROUTPUT ON
BEGIN
dbms_output.put_line ('Line One');
dbms_output.put_line (' ');
dbms_output.put_line ('Line Two');
END;
/

which returns the same output:

Line One
Line Two

In other words I cannot output a blank line to improve readability

For a long time (since Oracle 6.0) I have been aware of the CHR() built-in function and this offers one solution:

SET SERVEROUTPUT ON
BEGIN
dbms_output.put_line ('Line One');
dbms_output.put_line (CHR(10));
dbms_output.put_line ('Line Two');
END;
/

which returns the required output:

Line One
 
Line Two

However, a much more elegant solution is available, at least in Oracle 10.2, possibly in earlier versions which modifies the SET SERVEROUTPUT ON statement.

SET SERVEROUTPUT ON FORMAT WRAPPED
BEGIN
dbms_output.put_line ('Line One');
dbms_output.new_line;
dbms_output.put_line ('Line Two');
END;
/

which also returns the required output:

Line One
 
Line Two

Saturday, March 21, 2009

Configuring VMWare on a Laptop

I have been using VMWare Server Linux VMs on my Windows XP laptop for over two years; with both single instance and RAC configurations across numerous VMs. Normally my laptop is connected to my home network (192.168.1.x) so I have been assigning VMs with IP addresses in the same subnet and configuring bridged networks. This worked OK while I was at home but I always had a problem when I was on the road and I could not find a suitable network connection. If all my adapters were reporting "media disconnected" I could not get a network connection between my host and the VMs.

I finally solved this problem (with help from Simon Haslam of Verition) last week. I have installed a Microsoft Loopback Adapter and configured a network address on this. The network address works even if the laptop is not connected to any networks.

The Microsoft Loopback Adapter can be installed using Control Panel -> Add Hardware which launches the Add Hardware Wizard

On the first page answer select "Yes, I have already connected the hardware"

On the next page select "Add a new hardware device"

On the next page select "Install the hardware that I manually select from a list (Advanced)"

On the next page select "Network adapters"

On the next page in the Manufacturer drop down select "Microsoft" and then in the Network Adapter drop down select "Microsoft Loopback Adapter"

The Loopback Adapter will be installed.

You can then specify an IP address for the loopback adapter using Settings -> Network Connections. I used 192.168.5.100 for my host; 192.168.5.0 is a new subnet in my network.

I found I needed to restart the laptop at this point for VMWare to see the Loopback Adapter. I when started VMWare Server and selected Host -> Virtual Network Settings and then selected the Host Virtual Network Mapping tab. For VMnet0 I selected the Microsoft Loopback Adapter.

I then started the virtual operating system (Linux) and set the IP address for the VM to 192.168.5.104 in /etc/sysconfig/network-scripts/ifcfg-eth0 and /etc/hosts. This seems to be sufficient to use putty, pscp etc to communicate between host and VM.

I have not yet tried this configuration with RAC VMs mainly because I no longer have space on my hard drive for two VMs and the shared storage. Time to upgrade my laptop...

Tuesday, March 10, 2009

Enabling and Disabling Optimizer Bug Fixes

The following was recommended by Oracle support in connection with a specific bug. However, I think some Oracle users may find it useful when testing.

Oracle releases one-off bug fixes for the CBO quite frequently. You can find out which fixes are enabled in your own instance by inspecting the 10053 trace.

For example:

fix 5385629 = enabled
fix 5705630 = disabled
fix 1234567 = enabled
fix 6122894 = enabled
fix 5842686 = disabled
fix 6006300 = disabled

In the above example bug fix 1234567 is enabled. If you want to disable this fix dynamically you can try using:

ALTER SESSION SET "_fix_control" = '1234567:off';

The output of 10053 should change as follows:

fix 5385629 = enabled
fix 5705630 = disabled
fix 1234567 = disabled *
fix 6122894 = enabled
fix 5842686 = disabled
fix 6006300 = disabled

Note the asterisk indicates a non-default setting for the fix

To enable the fix again use:

ALTER SESSION SET "_fix_control" = '1234567:on';

Obviously this should only be used in a test environment, but it is a useful way to evaluate the impact of a bug fix without needing to repeatedly install and deinstall the patch - something that is under change control with a few of my larger customers.

As this is an unsupported parameter you consult Oracle Support before using it in a production environment.

Friday, November 9, 2007

Automatic Memory Management in Oracle 11.1

In Oracle 11g memory can be managed in two ways; manual and automatic. Manual memory management works exactly the same as in previous releases; Automatic memory management is new.

Automatic memory management allows both SGA and PGA memory to be allocated automatically by Oracle. It uses two new initialization parameters; MEMORY_MAX_TARGET a static parameter that specifies the maximum amount of memory the instance can ever use and MEMORY_TARGET, a dynamic parameter that specifies the amount of memory that is currently available to the instance. The instance is free to allocate memory to either the SGA or the PGA according to demand. See the documentation for more information.

The interesting part is the use of shared memory by the new Automatic Memory Management feature. Using Manual Memory Management, shared memory is allocated in one large segment. On Linux you can check the size of this segment using the ipcs -m command.

However, in 11.1 with a MEMORY_MAX_TARGET of 500M and a MEMORY_TARGET of 500M, the ipcs -m command executed by the oracle user will always return something similar to the following:

$ ipcs -m
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 98306 root 777 393216 2 dest
0x85489c70 458764 oracle 660 4096 0

So shared memory appears to have reduced in size to 4096 bytes. Obviously this is not entirely true. The shared memory area still exists, but it is no longer reported in full by the ipcs command. Instead the ipcs command now only reports the first 4096 bytes of shared memory.

You can see more information about shared memory by listing the /dev/shm directory. For example for an instance called TEST, the first few entries might be:

-rw-r----- 1 oracle oinstall 4194304 Nov 4 12:09 ora_TEST_458764_0
-rw-r----- 1 oracle oinstall 4194304 Nov 4 12:09 ora_TEST_458764_1
-rw-r----- 1 oracle oinstall 4194304 Nov 4 12:09 ora_TEST_458764_2
-rw-r----- 1 oracle oinstall 4194304 Nov 4 12:09 ora_TEST_458764_3
-rw-r----- 1 oracle oinstall 4194304 Nov 4 12:09 ora_TEST_458764_4

In the above example, each file represents 4MB of shared memory. In my example, I have configured a MEMORY_MAX_TARGET of 500M (MEMORY_TARGET defaults to the same value). In the /dev/shm directory there are actually 126 files (ora_TEST_458764_0 to ora_TEST_458764_125 inclusive) giving a total of 4 * 126 = 504MB. I'm not currently sure where the extra 4GB comes from; it is presumably an overhead in either the first or the last file.

Down to business then; why does this matter? Because when I try this in my VMWare RHEL4 32-bit environment, I got a bus error with shared memory. The error was actually caused by /dev/shm being to small. The worrying part was that it was not identified during instance startup. Oracle just carried on working as if nothing had happened. I discovered the problem when I was porting one of my SGA attach programs to 11.1 AMM. This program is a sort of "grep" for the SGA - it takes one parameter which is a hex value (usually an address) to search for across shared memory. It reports all hits. Obviously this requires a scan of all words in shared memory - which is surprisingly fast on most systems. When scanning one of the files (in thie case file 117) I hit the bus error. Strangely enough the bus error only affected a relatively small range of words in the middle of the file; but one bus error is enough to crash my program.

Eventually I tracked down the problem to /dev/shm. The default entry in /etc/fstab was as follows:

none /dev/shm tmpfs defaults 0 0

I found that I could make the problem disappear by changing the above entry to:

none /dev/shm tmpfs defaults,size=1000M 0 0

The above command increases the default amount of space allocated to /dev/shm from 500M to 1000M. You can use the df command to check how much memory has been allocated to /dev/shm. For example:

$ df /dev/shm
Filesystem 1K-blocks Used Available Use% Mounted on
none 1024000 512940 511060 51% /dev/shm

This seems to work, but leaves me feeling a bit uneasy. I have definitely been experiencing a memory problem which Oracle does not detect at startup. Presumably Oracle would just fail with a bus error if it attempted to access the same memory, but it might not attempt to do this until the instance has been running for some time, and the error might not be that apparent. Of course the bus error could be a result of running Oracle on the unsupported VMware server. I have not yet had time to install 11g on a "real" server.

Obviously my example might be flawed, but until this can be proved I recommend caution if you are considering using Automatic Memory Management.