Sunday, March 22, 2009

Generating Histograms

Today I have been trying to generate a histogram for a table column as part of a larger statistics project. In fact it is quite a straightforward process although some of the syntax is a bit tricky.

The table I am working with is called GP.CAR and contains a list of all cars that have raced in Formula 1 since 1961. I am trying to build a frequency histogram based on the DRIVER_KEY column which is a CHAR(4). The histogram will contain the following data:













MSCH91
APRO51
ASEN41
NMAN31
JSTE27
NLAU25
JCLA25
NPIQ23
FALO22
DHIL22
MHAK20

The histogram can be built using the following code:

DECLARE
l_statrec dbms_stats.statrec;
l_charvals dbms_stats.chararray;
l_bkvals dbms_stats.numarray;
BEGIN
NULL;
l_charvals := dbms_stats.chararray ();
l_charvals.extend (11);

l_bkvals := dbms_stats.numarray ();
l_bkvals.extend (11);

l_charvals(1) := 'MSCH'; l_bkvals(1) := 91;
l_charvals(2) := 'APRO'; l_bkvals(2) := 51;
l_charvals(3) := 'ASEN'; l_bkvals(3) := 41;
l_charvals(4) := 'NMAN'; l_bkvals(4) := 31;
l_charvals(5) := 'JSTE'; l_bkvals(5) := 27;
l_charvals(6) := 'NLAU'; l_bkvals(6) := 25;
l_charvals(7) := 'JCLA'; l_bkvals(7) := 25;
l_charvals(8) := 'NPIQ'; l_bkvals(8) := 23;
l_charvals(9) := 'FALO'; l_bkvals(9) := 22;
l_charvals(10) := 'DHIL'; l_bkvals(10) := 22;
l_charvals(11) := 'MHAK'; l_bkvals(11) := 20;

l_statrec.epc := 11;
l_statrec.bkvals := l_bkvals;
l_statrec.eavs := 0;

DBMS_STATS.PREPARE_COLUMN_VALUES (l_statrec,l_charvals);

DBMS_STATS.SET_COLUMN_STATS
(
ownname => 'GP',
tabname => 'CAR',
colname => 'DRIVER_KEY',
distcnt => 11,
density => 0.00210084,
nullcnt => 0,
srec => l_statrec,
avgclen => 4
);
END;
/

I have not yet worked out how to calculate the density. In the above example I used the value in DBA_TAB_COLUMNS generated by a previous GATHER_TABLE_STATS operation.

In order to test that the histogram had been correctly generated, I used the following code:

DECLARE
l_column_name VARCHAR2(30);
l_endpoint_number NUMBER;
l_endpoint_value NUMBER;
l_cardinality NUMBER;

CURSOR c1 IS
SELECT column_name
FROM dba_tab_columns
WHERE owner = 'GP'
AND table_name = 'CAR'
AND histogram = 'FREQUENCY'
ORDER BY column_id;

CURSOR c2 (p_column_name VARCHAR2) IS
SELECT endpoint_value,endpoint_number,
endpoint_number -
NVL (LAG (endpoint_number,1) OVER (ORDER BY endpoint_number),0)
FROM dba_histograms
WHERE owner = 'GP'
AND table_name = 'CAR'
AND column_name = p_column_name
ORDER BY endpoint_number;
BEGIN
OPEN c1;
LOOP
FETCH c1 INTO l_column_name;
EXIT WHEN c1%NOTFOUND;
DBMS_OUTPUT.PUT_LINE ('Column Name: '||l_column_name);
DBMS_OUTPUT.NEW_LINE;

DBMS_OUTPUT.PUT ('Endpoint Value ');
DBMS_OUTPUT.PUT (' ');
DBMS_OUTPUT.PUT ('Endpoint Number');
DBMS_OUTPUT.PUT (' ');
DBMS_OUTPUT.PUT ('Cardinality');
DBMS_OUTPUT.NEW_LINE;

OPEN c2 (l_column_name);
LOOP
FETCH c2 INTO l_endpoint_value,l_endpoint_number,l_cardinality;
EXIT WHEN c2%NOTFOUND;
DBMS_OUTPUT.PUT (LPAD (TO_CHAR(l_endpoint_value),32,' '));
DBMS_OUTPUT.PUT (' ');
DBMS_OUTPUT.PUT (LPAD (TO_CHAR(l_endpoint_number),12,' '));
DBMS_OUTPUT.PUT (' ');
DBMS_OUTPUT.PUT (LPAD (TO_CHAR(l_cardinality),12,' '));
DBMS_OUTPUT.NEW_LINE;
END LOOP;
CLOSE c2;
DBMS_OUTPUT.NEW_LINE;
END LOOP;
CLOSE c1;
END;
/

The values in the ENDPOINT_NUMBER column are cumulative. Therefore I have used a LAG analytic function to calculate the cardinality of each value. The above script generates output similar to the following:

Column Name: DRIVER_KEY

Endpoint Value Endpoint Number Cardinality
-------------- --------------- -----------
40149562867409400000000000000000 91 91
33912840973169800000000000000000 142 51
33918822668491100000000000000000 183 41
40656607447559200000000000000000 214 31
38592008404879700000000000000000 241 27
40654579423238300000000000000000 266 25
38559493043189900000000000000000 291 25
40662755645815800000000000000000 314 23
36478518251134200000000000000000 336 22
35454232704855500000000000000000 358 22
40127236464058400000000000000000 378 20

DBMS_OUTPUT.NEW_LINE

This week I have been writing a lengthy SQL*Plus script to extract details about object statistics from the data dictionary. Some of the join statements have not been performing well, but I did not have time to investigate them so I rewrote them using PL/SQL cursors and outputting the results using DBMS_OUTPUT.

The problem I have always had with DBMS_OUTPUT is outputting blank lines. For example:

SET SERVEROUTPUT ON
BEGIN
dbms_output.put_line ('Line One');
dbms_output.new_line;
dbms_output.put_line ('Line Two');
END;
/

returns the following output

Line One
Line Two

I have also tried using PUT_LINE with a space character. For example:

SET SERVEROUTPUT ON
BEGIN
dbms_output.put_line ('Line One');
dbms_output.put_line (' ');
dbms_output.put_line ('Line Two');
END;
/

which returns the same output:

Line One
Line Two

In other words I cannot output a blank line to improve readability

For a long time (since Oracle 6.0) I have been aware of the CHR() built-in function and this offers one solution:

SET SERVEROUTPUT ON
BEGIN
dbms_output.put_line ('Line One');
dbms_output.put_line (CHR(10));
dbms_output.put_line ('Line Two');
END;
/

which returns the required output:

Line One
 
Line Two

However, a much more elegant solution is available, at least in Oracle 10.2, possibly in earlier versions which modifies the SET SERVEROUTPUT ON statement.

SET SERVEROUTPUT ON FORMAT WRAPPED
BEGIN
dbms_output.put_line ('Line One');
dbms_output.new_line;
dbms_output.put_line ('Line Two');
END;
/

which also returns the required output:

Line One
 
Line Two

Saturday, March 21, 2009

Configuring VMWare on a Laptop

I have been using VMWare Server Linux VMs on my Windows XP laptop for over two years; with both single instance and RAC configurations across numerous VMs. Normally my laptop is connected to my home network (192.168.1.x) so I have been assigning VMs with IP addresses in the same subnet and configuring bridged networks. This worked OK while I was at home but I always had a problem when I was on the road and I could not find a suitable network connection. If all my adapters were reporting "media disconnected" I could not get a network connection between my host and the VMs.

I finally solved this problem (with help from Simon Haslam of Verition) last week. I have installed a Microsoft Loopback Adapter and configured a network address on this. The network address works even if the laptop is not connected to any networks.

The Microsoft Loopback Adapter can be installed using Control Panel -> Add Hardware which launches the Add Hardware Wizard

On the first page answer select "Yes, I have already connected the hardware"

On the next page select "Add a new hardware device"

On the next page select "Install the hardware that I manually select from a list (Advanced)"

On the next page select "Network adapters"

On the next page in the Manufacturer drop down select "Microsoft" and then in the Network Adapter drop down select "Microsoft Loopback Adapter"

The Loopback Adapter will be installed.

You can then specify an IP address for the loopback adapter using Settings -> Network Connections. I used 192.168.5.100 for my host; 192.168.5.0 is a new subnet in my network.

I found I needed to restart the laptop at this point for VMWare to see the Loopback Adapter. I when started VMWare Server and selected Host -> Virtual Network Settings and then selected the Host Virtual Network Mapping tab. For VMnet0 I selected the Microsoft Loopback Adapter.

I then started the virtual operating system (Linux) and set the IP address for the VM to 192.168.5.104 in /etc/sysconfig/network-scripts/ifcfg-eth0 and /etc/hosts. This seems to be sufficient to use putty, pscp etc to communicate between host and VM.

I have not yet tried this configuration with RAC VMs mainly because I no longer have space on my hard drive for two VMs and the shared storage. Time to upgrade my laptop...

Tuesday, March 10, 2009

Enabling and Disabling Optimizer Bug Fixes

The following was recommended by Oracle support in connection with a specific bug. However, I think some Oracle users may find it useful when testing.

Oracle releases one-off bug fixes for the CBO quite frequently. You can find out which fixes are enabled in your own instance by inspecting the 10053 trace.

For example:

fix 5385629 = enabled
fix 5705630 = disabled
fix 1234567 = enabled
fix 6122894 = enabled
fix 5842686 = disabled
fix 6006300 = disabled

In the above example bug fix 1234567 is enabled. If you want to disable this fix dynamically you can try using:

ALTER SESSION SET "_fix_control" = '1234567:off';

The output of 10053 should change as follows:

fix 5385629 = enabled
fix 5705630 = disabled
fix 1234567 = disabled *
fix 6122894 = enabled
fix 5842686 = disabled
fix 6006300 = disabled

Note the asterisk indicates a non-default setting for the fix

To enable the fix again use:

ALTER SESSION SET "_fix_control" = '1234567:on';

Obviously this should only be used in a test environment, but it is a useful way to evaluate the impact of a bug fix without needing to repeatedly install and deinstall the patch - something that is under change control with a few of my larger customers.

As this is an unsupported parameter you consult Oracle Support before using it in a production environment.