
Simplifying the client install

Our application server estate has a huge mix of Oracle client installs, from 9 all the way to 12. These are all manually installed in varying locations and with differing components and languages. For example, some might be installed in German with a custom install to get sqlldr, and others might just be an instant client or a full admin install in English.

The end user machines all have a nicely scripted client package, which keeps desktops/laptops consistent, but we can't use this on the application servers.

So how to simplify the install and make things consistent?

First up, we wanted to make sure that whatever client we chose covered all of our client requirements - so no separate ODAC/ODP homes etc., everything in one simple home.

And secondly, we wanted to avoid going through the GUI installer each time, as that can easily end up with each client becoming subtly different.

There seem to be three ways to accomplish this

1) install with response file
2) clone with clone.pl
3) run install with everything in a single command

Option 1 is fine but involves creating a response file and then making sure it is always available with the distributed software.

Option 2 is what we use for database installation on Unix. I assume it works on Windows too and may be an OK solution, but I wasn't sure how the whole process would work on Windows compared to Linux - it may be just as easy, I just didn't try it out.

The third option seemed the easiest solution for this

So to implement a 12c client install completely from a single command we just run the following

d:\orainstaller\client\setup.exe -silent -waitforcompletion FROM_LOCATION=D:\orainstaller\client\stage\products.xml oracle.install.client.installType="Administrator" ORACLE_HOME="D:\oracle\oracle12c" ORACLE_HOME_NAME="oraclient12c" ORACLE_BASE="D:\oracle" DECLINE_SECURITY_UPDATES=true oracle.install.IsBuiltInAccount=true SELECTED_LANGUAGES=en_GB

This installs a full admin client into D:\oracle\oracle12c as long as you unzipped the software into d:\orainstaller

As soon as the command finishes the client is all ready to use - nice and simple

More datapump master table analysis

We're currently looking at migrating an old 9i system to 12c (via 10g) and the datapump part is taking longer than we'd like. I'm doing a bit of analysis to see where the time is spent and have built a few queries I thought I'd share as they may be useful for others. In this case I kept the master table with KEEP_MASTER=Y and the job was called SYS_IMPORT_FULL_03.
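
As a quick orientation before diving in (just a sketch against the same kept master table, not part of the original analysis), a simple group by gives a feel for what the master table actually contains:

-- rough sketch: what object types are tracked in the kept master table
select object_type, count(*)
  from sys.sys_import_full_03
 group by object_type
 order by 2 desc;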

The first one just shows an overall elapsed time for each type of object being loaded:

select object_type,
       (completion_time - started) * (24 * 60 * 60) as elapsed_seconds
  from (with datapump_data as (select completion_time, object_type
                                 from sys.sys_import_full_03
                                where process_order = -6
                                  and object_type <> 'TABLE_DATA'
                               union
                               select max(completion_time), 'TABLE_DATA'
                                 from sys.sys_import_full_03 x
                                where object_type = 'TABLE_DATA'
                                  and process_order > 0)
         select nvl((lag(completion_time, 1) over(order by completion_time)),
                    completion_time) as started,
                completion_time,
                object_type
           from datapump_data x
          order by started)
          order by elapsed_seconds desc

Which shows something like this


As you might expect, TABLE_DATA takes the longest, followed by INDEX - you might say that's not so useful as it doesn't break things down further.

So let's do that:

select object_schema,
       object_name,
       (completion_time - start_time) * (24 * 60 * 60) as elapsed_time,
       object_type,
       process_order,
       base_object_name,
       base_object_schema
  from sys.sys_import_full_03 x
 where object_type = 'TABLE_DATA'
   and process_order > 0
 order by elapsed_time desc

Which shows us this


More useful - we can now see how long each table took. In this case one table took way longer than all the others and in fact matches the total run time of the TABLE_DATA section - this is because I used parallel and a single slave was left to do this table, while all the other slaves finished all the other tables in less time than it took for this one.

A definite candidate then for closer analysis.

How about indexes though?

select object_schema,
       object_name,
       (completion_time - started) * (24 * 60 * 60) as elapsed
  from (select object_schema,
               object_name,
               nvl((lag(completion_time, 1) over(order by completion_time)),
                   completion_time) as started,
               completion_time
          from sys.sys_import_full_03 x
         where object_type = 'INDEX'
           and process_order > 0)
 order by elapsed desc


So we can see that the indexes on this table took the most time - as to be expected.

How about stats then?

select base_object_schema,
       base_object_name,
       (completion_time - started) * (24 * 60 * 60) as elapsed
  from (select base_object_name,
               base_object_schema,
               nvl((lag(completion_time, 1) over(order by completion_time)),
                   completion_time) as started,
               completion_time
          from sys.sys_import_full_03 x
         where object_type = 'TABLE_STATISTICS'
           and process_order > 0)
 order by elapsed desc


Again the usual suspects. I suspect histograms mean there is a lot more stats data for the big table, as a table being big doesn't necessarily mean a lot of stats information.

So some useful stuff there.

The next step is for us to try partitioning the big table, which will enable multiple slaves to work on it, and to truncate the second biggest table as the data is not required. Then we'll rerun the import job and see how much time we've saved.
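
For picking those candidates, a rough size ranking is usually enough - a minimal sketch using dba_segments (the top-10 cutoff is an arbitrary choice):

-- sketch: rank segments by size to find partition/truncate candidates
select *
  from (select owner, segment_name, segment_type,
               round(bytes / 1024 / 1024 / 1024, 1) as gb
          from dba_segments
         order by bytes desc)
 where rownum <= 10;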

When a 12c upgrade unexpectedly breaks something - beware 9i!

The 12c upgrades continue to rumble on - this week we upgraded our cloud control repository (Oracle finally added support to be able to do this). This all went fine, however a couple of days later we had reports that the 'portal' wasn't working.

Now the 'portal' is an ancient utility system developed in house that was considered to be unused (at least by us) - however it appeared that some user groups were still using it. It's nothing to do with the cloud control setup; it just shares the database.

This was erroring with

ORA-28040: No matching authentication protocol

A quick look at the error revealed that this (which I'm sure will become quite a familiar error to everyone) is due to us trying to use a 9i client to connect to 12c - this is not supported at all; the earliest client version you can use is 10g.

Right - let's have a look at the setup of the portal to see how we can fix this.

A quick look reveals that ORACLE_HOME etc. is explicitly set in the environment before the application is started (the application being Apache/Perl) - so we just change that to the 10g home that is installed on the box, restart, and it's fixed, right?

Well, sort of - some of the screens seem to be using the generic DB connectivity from Perl and do start to work with this change - however a couple of screens use the Perl DBD::Oracle module, and these are not working....

After a bit more digging we discovered this

dump -H ./Oracle.so

./Oracle.so:

                        ***Loader Section***
                      Loader Header Information
VERSION# #SYMtableENT #RELOCent LENidSTR
0x00000001 0x000000f2 0x000001d9 0x00000092

#IMPfilID OFFidSTR LENstrTBL OFFstrTBL
0x00000004 0x00002cfc 0x0000103b 0x00002d8e


                        ***Import File Strings***
INDEX PATH BASE MEMBER
0 /usr/local/lib:/app/oracle/product/9.2.0.5/lib32/:/app/oracle/product/9.2.0.5/rdbms//lib32/:/usr/lib:/lib 
1 libc.a shr.o
2 libclntsh.a shr.o


So the module looks for libclntsh.a  only in the paths above - it's not considering any environment variables.

So how do we fix that?

Luckily the first path listed is /usr/local/lib, so we can just create a symlink from this directory to the 10g home and it will find that first and fix the problem.

We try that and refresh the screen and get this

install_driver(Oracle) failed: Can't load '/portal/perl/lib/site_perl/5.8.6/aix/auto/DBD/Oracle/Oracle.so' for module DBD::Oracle: Could not load module /portal/perl/lib/site_perl/5.8.6/aix/auto/DBD/Oracle/Oracle.so.
  Dependent module /usr/local/lib/libclntsh.a(shr.o) could not be loaded.
  The module has an invalid magic number.
System error: Exec format error
Could not load module /usr/local/lib/libclntsh.a.

So, progress - however I've symlinked to the wrong files - I pointed at the 64-bit ones and Perl is only 32-bit on this server.

We finally get the command right and run this:

ln -s /app/oracle/product/10.2.0.2.DB/lib32/libclntsh.a /usr/local/lib/libclntsh.a 

And the problem is fixed!

So a relatively easy fix in this case - but a major gotcha with 12c upgrades is that 9i clients cannot connect at all. Bear this in mind when doing any upgrades - I'm sure there is still a lot of 9i out there....



RMAN exclude

A couple of weeks ago I blogged about moving a large old 10g database to a new disk array. This process is continuing across all of our servers and this week I had to move another 10g one (although for the purposes of this post the version is largely irrelevant).

Our new simplified setup is just 2 filesystems - one for data/redo/undo/control and one for arch/redo/control - this is the default setup if you make use of db_create_file_dest and db_recovery_file_dest. This makes admin much easier than having loads of separate locations.

Anyhow, we'd been limiting the size of an individual filesystem to 2TB (a recommendation from the Unix team) - this particular database was about 3TB. This meant the method from my earlier post couldn't work directly - I would need some extra/modified steps.
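
(For reference, a quick way to sanity check the size against that 2TB limit - just a sketch, ignoring temp files:)

-- rough datafile footprint in TB
select round(sum(bytes) / 1024 / 1024 / 1024 / 1024, 2) as datafile_tb
  from v$datafile;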

Being lazy, I wanted to do this in the minimum number of steps and tried to find the simplest way of doing that.

The backup database as copy approach couldn't work as the total DB wouldn't fit into one location - I needed to split the datafiles over 2 mount points - so what's the easiest way to do that?

Initially I had convinced myself I could run the same command as previously and just add a 'skip tablespace' type clause - a few failed attempts at guessing syntax and a look at the documentation revealed that this isn't possible.

So what to do?

A quick scan through the docs reveals the exclude option (not something I'd had cause to use before).

This option allows specific tablespaces to be excluded from a full database backup - this is just what I want, and I can then back up the excluded tablespace separately afterwards.

By default nothing is excluded - as can be seen from this output

RMAN> show exclude;

using target database control file instead of recovery catalog
RMAN configuration parameters for database with db_unique_name XXXX are:
RMAN configuration has no stored or default parameters

RMAN>

To exclude a tablespace we can run this

RMAN> configure exclude for tablespace bigone;

Tablespace BIGONE will be excluded from future whole database backups
new RMAN configuration parameters are successfully stored

RMAN>

This is now displayed by the show exclude command:

RMAN> show exclude
2> ;

RMAN configuration parameters for database with db_unique_name XXXX are:
CONFIGURE EXCLUDE FOR TABLESPACE 'BIGONE';

RMAN>

If we now do a full database backup everything but this tablespace is backed up - we do however see this message

Starting backup at 14-MAY-2015 13:40:21
file 14 is excluded from whole database backup
file 16 is excluded from whole database backup

To remove the exclude we run

RMAN> CONFIGURE EXCLUDE FOR TABLESPACE 'BIGONE' clear;

Tablespace BIGONE will be included in future whole database backups
old RMAN configuration parameters are successfully deleted

RMAN>

So now the full database backup is complete, I just do an individual backup of the tablespace I excluded to the 2nd filesystem location:

backup as copy tablespace bigone to destination '/fs2';

Now we have the whole database (spread across the 2 new filesystems) ready to switch to, as we did in the previous note.

So we switch over and all is well. Another useful trick I discovered when doing this is that after the switch, the original files on the old array are classified as another copy of the database. So to tidy up and remove all the files from the original locations we can just run:

RMAN> delete copy of database;

This removes all the old files - I can see this being very useful if the original layout is a mess and intermixed with other database files - you know that RMAN will only remove the correct ones, taking some of the human error element out of this.
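
Before running the delete it's worth a quick sanity check that the database is now only pointing at files on the new filesystems - a minimal check along these lines:

-- current datafile, tempfile and redo locations
select name from v$datafile
union all
select name from v$tempfile
union all
select member from v$logfile;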

I would say that the exclude setting could be a sleeper problem - so make sure you switch it off afterwards - I can see it being activated and forgotten about. If you don't then check the backup logs in detail, you could end up with an unrecoverable tablespace at some point - so caveat emptor.....

12c subtle backup view differences

After our recent spate of 12c upgrades I was reviewing a couple of monitoring jobs we had set up and discovered a problem with our cloud control metric extension that checks how long it has been since the last level 0 backup.

(stick with me all non cloud control folks as this is still relevant for you too...)

** quick clarification point - level 0 and full backups are essentially the same - however they are recorded as completely different things in the controlfile/RMAN catalog, and a full backup cannot have an incremental level 1 backup taken after it - therefore all our backups are defined as incremental level 0 rather than full **

The simple check I run to see if there has been a level 0 backup involves running this SQL:

  select nvl(min  (sysdate-bjd.start_time),31)
  from v$rman_backup_job_details bjd ,v$backup_set_details bsd
  where bsd.session_recid=bjd.session_recid
  and bsd.session_key=bjd.session_key
  and bsd.SESSION_STAMP=bjd.SESSION_STAMP
  and bsd.BACKUP_TYPE||bsd.INCREMENTAL_LEVEL='D0'
  and bjd.status='COMPLETED'

Essentially it tries to identify the most recent incremental level 0 backup that finished successfully; if it doesn't find any at all it returns 31, and an alert then kicks in if this value is >8.

That's been working fine for 10g and 11g (and I think it works with 9i too, but we can't connect to our 9i DBs from cloud control as the hosting OS is too old for the 12c agent).

However in 12c this always returns 31 (i.e. null) - but why.....?

Well, after a bit of investigation it seems Oracle made a subtle change to the data presented in V$BACKUP_SET_DETAILS - now, for incremental backups, the backup type is explicitly recorded as 'I' rather than 'D'. That makes a lot of sense (and would have helped me when I was writing the original query), however it means that my query no longer works.
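
A quick way to see what your own controlfile records (just a sketch) - on our 12c databases the level 0 sets now show up with backup type 'I':

-- which backup type / level combinations exist in this controlfile
select backup_type, incremental_level, count(*) as sets
  from v$backup_set_details
 group by backup_type, incremental_level
 order by backup_type, incremental_level;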

To fix it I just need to change one letter (a D to an I) to get this SQL:

  select nvl(min  (sysdate-bjd.start_time),31)
  from v$rman_backup_job_details bjd ,v$backup_set_details bsd
  where bsd.session_recid=bjd.session_recid
  and bsd.session_key=bjd.session_key
  and bsd.SESSION_STAMP=bjd.SESSION_STAMP
  and bsd.BACKUP_TYPE||bsd.INCREMENTAL_LEVEL='I0'
  and bjd.status='COMPLETED'

So that fixes it and gives me the result I want - however from cloud control it means I have 2 different queries, which I don't want - so how do I deal with that?

Well, thanks to some clever PL/SQL conditional compilation I can do this:

begin
OPEN :1
$IF DBMS_DB_VERSION.VER_LE_11 $THEN 
  for   select nvl(min  (sysdate-bjd.start_time),31)
  from v$rman_backup_job_details bjd ,v$backup_set_details bsd
  where bsd.session_recid=bjd.session_recid
  and bsd.session_key=bjd.session_key
  and bsd.SESSION_STAMP=bjd.SESSION_STAMP
  and bsd.BACKUP_TYPE||bsd.INCREMENTAL_LEVEL='D0'
  and bjd.status='COMPLETED';
$ELSE
  for   select nvl(min  (sysdate-bjd.start_time),31)
  from v$rman_backup_job_details bjd ,v$backup_set_details bsd
  where bsd.session_recid=bjd.session_recid
  and bsd.session_key=bjd.session_key
  and bsd.SESSION_STAMP=bjd.SESSION_STAMP
  and bsd.BACKUP_TYPE||bsd.INCREMENTAL_LEVEL='I0'
  and bjd.status='COMPLETED';
$END
end;

So if the version is less than or equal to 11, run the top query; if it's anything else (i.e. 12c and above), run the bottom query - neat huh? (Well, I thought so anyway... :-))

For reference, the original cloud control walkthrough is here: http://dbaharrison.blogspot.de/2013/07/12c-metric-extension-for-cloud-control.html - just change the SQL to the PL/SQL shown above and away you go.


The battle of Thermopylae (well rman and flashback anyway)

This is a very badly named post (it's a vague reference to the fact that this is my 300th blog post, and it's come about as I have been having a major battle trying to do a restore this past day.....)

So anyway, back to the job in hand:

We have a very large warehouse (several TB) and a developer accidentally dropped one of their schemas. Luckily we had flashback enabled, so we thought it was just a simple case of flashing back to the point before the schema was dropped and everything would be happy - right?

Well, in this case, wrong - the initial problem was that the flashback logs didn't go back far enough, and we didn't find that out until the flashback failed to go back that far (in hindsight perhaps we should have checked that first - or even better, Oracle should tell us if it hasn't got the damn logs in the first place and not even start to try!)
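
(For the record, this is the check we should have run first - v$flashback_database_log shows how far back the flashback logs actually reach:)

-- how far back can we actually flash back?
select oldest_flashback_scn,
       oldest_flashback_time,
       retention_target,
       round(flashback_size / 1024 / 1024 / 1024) as flashback_gb
  from v$flashback_database_log;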

So after that failed (which was our quick fix method) we were reduced to other methods.

Our first attempt was to just do a point in time recovery (using RMAN duplicate) to another host - this would take a while as it's so big, but should allow us to restore and then extract the schema in question - so we gave that a go, and this is where things got interesting......

This is the basic RMAN duplicate command we were running:

RMAN> run {
2>   set until time "to_date('10-JUN-2015 04:30:00', 'dd-MON-yyyy hh24:mi:ss')";
3>   allocate auxiliary channel t1 type 'sbt_tape' parms 'ENV=(TDPO_OPTFILE=/tsm/ORIG/conf/tdpo.opt)';
4>   duplicate database ORIG to NEW
5>   SKIP TABLESPACE IFC_USER_RISKCUBE,
6>   DWH_DISTRIBUTION,
7>   SB_MRDD,
8>   SAP_BODS,
9>   DWH_STAGING,
10>  DWH_INTEGRATION,
11>  DWH_DATAMART,
12>  DWH_TESTDATA,
13>  DWH_DERIVATION_AREA,
14>  USERS;
15> }

This would run for a while and then get this error

RMAN-03002: failure of Duplicate Db command at 06/12/2015 18:54:15
RMAN-05501: aborting duplication of target database
RMAN-05541: no archived logs found in target database


The backups are clearly there though when running reports from the RMAN catalog, so what is going on? Anyway, we messed around with that a few times until it became apparent that this was getting nowhere fast.

Then I had another idea - the schema in question was isolated to a single tablespace, so we could make use of the neat tablespace point in time recovery (recover tablespace ... until time) command directly (and Oracle would do all the clever magic with auxiliary instances and TTS to make that happen) - so we gave that a try....

And that failed miserably too - annoyingly I didn't save the error message, but anyway it failed - and in fact left the system in a worse state than when I started....

This step to offline the tablespace for the restore worked OK:

 alter tablespace APEX_WORKSPACE_ER offline
Completed:  alter tablespace APEX_WORKSPACE_ER offline

-- this is the bit where the TSPITR failed without even doing anything--

When I tried to bring it back online (bearing in mind nothing had actually been done to the file):

alter tablespace APEX_WORKSPACE_ER online
ORA-1190 signalled during: alter tablespace APEX_WORKSPACE_ER online...
Checker run found 3 new persistent data failures

ALTER DATABASE RECOVER  datafile '/oracle/EDER/oradata/EDER2/datafile/o1_mf_apex_wor_bcwmswsq_.dbf'
Media Recovery Start
Serial Media Recovery started
Cannot mark control file as backup: flashback database enabled
Media Recovery failed with error 38872
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile '/oracle/ORIG/oradata/ORIG/datafile/o1_mf_apex_wor_bcwmswsq_.dbf' ...
2015-06-12 19:32:01.334000 +02:00

And the file, and therefore the tablespace, was pretty much toast now - it's possible that disabling flashback at this point may have allowed the file recovery to work, but I didn't try that.
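
For completeness, checking (and, had we wanted to, disabling) flashback would look roughly like this - we didn't actually try it here:

-- is flashback database currently enabled?
select flashback_on from v$database;

-- disabling it (not attempted in this case) would be:
-- alter database flashback off;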

OK - now what - running out of options...

Well, the ultimate fallback is always a restore from tape. I didn't want to do this directly over the top in case something else went wrong - at the moment only the one tablespace was corrupted - so I decided to restore the database to a different host from the backups.

So anyway, I set all that up (TSM config files/spfile etc.) and then gave it a go - this must surely work - good old fashioned restore never lets us down.....?

*** now I've skipped a little bit here, as I had discovered that to do the restore to this point in time I needed the earlier incarnation of the database, as flashback had messed things up....suffice to say this has to be used and this is what I set at the start ***

Now this is the point where it got really worrying - let's do the first bit of the restore (the controlfile). I have to set the dbid - as we have no controlfile at the moment, RMAN has no idea which DB we actually want to restore (the DB name is not enough) - and we also have to go to the earlier incarnation - so that's what I do here:

rman target=/

Recovery Manager: Release 11.2.0.4.0 - Production on Sat Jun 13 02:26:52 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ORIG (not mounted)

RMAN>  connect catalog user/pass@rcat
connected to recovery catalog database

RMAN> set dbid=3191228040;
reset database to incarnation 1828799752;
executing command: SET DBID
database name is "ORIG" and DBID is 3191228040

RMAN>


database reset to incarnation 1828799752

RMAN>

So far so good - so let's bring back the controlfile:

RMAN> run {
2>   set until time "to_date('10-JUN-2015 04:30:00', 'dd-MON-yyyy hh24:mi:ss')";
3>   allocate channel t1 type 'sbt_tape' parms 'ENV=(TDPO_OPTFILE=/tsm/ORIG/conf/tdpo.opt)';
4>   restore controlfile;
5>   sql "alter database mount";
6> }

executing command: SET until clause

allocated channel: t1
channel t1: SID=101 device type=SBT_TAPE
channel t1: Data Protection for Oracle: version 6.3.0.0

Starting restore at 13-JUN-15

new media label is 36077 for piece cf_auto_c-3191228040-20150610-15
new media label is 36135 for piece cf_auto_c-3191228040-20150610-16
channel t1: starting datafile backup set restore
channel t1: restoring control file
channel t1: reading from backup piece cf_auto_c-3191228040-20150610-15
channel t1: piece handle=cf_auto_c-3191228040-20150610-15 tag=TAG20150610T042704
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:01:45
output file name=/oracle/ORIG/oradata/ORIG/controlfile/o1_mf_bqpycl4t_.ctl
output file name=/oracle/ORIG/recovery_area/ORIG/controlfile/o1_mf_bqpyclow_.ctl
Finished restore at 13-JUN-15

sql statement: alter database mount
released channel: t1

Right - now at this point anything I tried to do to restore the rest of the database failed (and failed badly) - generally resulting in this error:

RMAN-03002: failure of restore command at 06/12/2015 20:14:40
RMAN-00600: internal error, arguments [8714] [] [] [] []

Which is one I've seen before (actually on this same set of databases) and never really got to the bottom of - and there is hardly anything else on that error on the internet.

I was close to giving up when I noticed something really odd - look at the output of this command when connected to the catalog:

RMAN> list incarnation of database;


List of Database Incarnations
DB Key  Inc Key DB Name  DB ID            STATUS  Reset SCN  Reset Time
------- ------- -------- ---------------- --- ---------- ----------
1828799650 1828799651 ORIG    3191228040       PARENT  1          25-OCT-12
1828799650 1828799752 ORIG    3191228040       CURRENT 8931318782946 05-SEP-14

Now compare that with what you get when you just connect to the target and run it against the controlfile:

oracle@server:/oracle/11.2.0.4.3.DB/dbs> rman target=/

Recovery Manager: Release 11.2.0.4.0 - Production on Sat Jun 13 02:30:21 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ORIG (DBID=3191228040, not open)

RMAN> list incarnation of database;

using target database control file instead of recovery catalog

List of Database Incarnations
DB Key  Inc Key DB Name  DB ID            STATUS  Reset SCN  Reset Time
------- ------- -------- ---------------- --- ---------- ----------
1       1       ORIG    3191228040       CURRENT 1          25-OCT-12
2       2       ORIG    3191228040       ORPHAN  8931318782946 05-SEP-14

The incarnation numbers in the controlfile are just 1 and 2????? What is going on here?
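
For what it's worth, the same controlfile view of the incarnations can be pulled out with SQL (a quick sketch):

-- incarnations as recorded in the mounted controlfile
select incarnation#, resetlogs_change#, resetlogs_time, status
  from v$database_incarnation
 order by incarnation#;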

Anyway, it gave me an idea - let's try the restore now using just the controlfile information (the catalog was only needed to get the controlfile back).

So I connect just to that, choose the incarnation that corresponds to the controlfile's version of things, and then try again (by the way, note the use of the skip clause on restore and recover to reduce the amount of stuff coming back, as I have no need for it):

RMAN> reset database to incarnation 2;

database reset to incarnation 2

RMAN>

RMAN> run {
2>   set until time "to_date('10-JUN-2015 04:30:00', 'dd-MON-yyyy hh24:mi:ss')";
3>   allocate channel t1 type 'sbt_tape' parms 'ENV=(TDPO_OPTFILE=/tsm/ORIG/conf/tdpo.opt)';
4>   restore database
5>   SKIP TABLESPACE IFC_USER_RISKCUBE,
6>   DWH_DISTRIBUTION,
7>   SB_MRDD,
8>   DWH_STAGING,
9>   DWH_INTEGRATION,
10>  DWH_DATAMART,SAP_BODS,
11>  DWH_TESTDATA,
12>  DWH_DERIVATION_AREA;
13>  recover database SKIP TABLESPACE IFC_USER_RISKCUBE,
14>  DWH_DISTRIBUTION,
15>  SB_MRDD,
16>  DWH_STAGING,
17>  DWH_INTEGRATION,
18>  DWH_DATAMART,SAP_BODS,
19>  DWH_TESTDATA,
20>  DWH_DERIVATION_AREA
21>  ;
22> }

executing command: SET until clause

allocated channel: t1
channel t1: SID=101 device type=SBT_TAPE
channel t1: Data Protection for Oracle: version 6.3.0.0

Starting restore at 13-JUN-15
Starting implicit crosscheck backup at 13-JUN-15
Finished implicit crosscheck backup at 13-JUN-15

Starting implicit crosscheck copy at 13-JUN-15
Finished implicit crosscheck copy at 13-JUN-15

searching for all files in the recovery area
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: /oracle/ORIG/recovery_area/ORIG/controlfile/o1_mf_bqp8nypp_.ctl
File Name: /oracle/ORIG/recovery_area/ORIG/autobackup/2015_06_13/o1_mf_n_882238842_bqpxtv6q_.bkp


channel t1: starting datafile backup set restore
channel t1: specifying datafile(s) to restore from backup set
channel t1: restoring datafile 00003 to /oracle/EDER/oradata/ORIG/datafile/o1_mf_sys_undo_88kqvglo_.dbf
channel t1: restoring datafile 00024 to /oracle/EDER/oradata/ORIG/datafile/o1_mf_sys_undo_8sw5yf1d_.dbf
channel t1: reading from backup piece database_online_ORIG_881978020_20150610_10639.rman
channel t1: piece handle=database_online_ORIG_881978020_20150610_10639.rman tag=TAG20150609T235003
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:19:15
channel t1: starting datafile backup set restore
channel t1: specifying datafile(s) to restore from backup set
channel t1: restoring datafile 00002 to /oracle/EDER/oradata/ORIG/datafile/o1_mf_sysaux_88kqvdqb_.dbf
channel t1: restoring datafile 00052 to /oracle/EDER/oradata/ORIG/datafile/o1_mf_apex_wor_bcwmswsq_.dbf
channel t1: reading from backup piece database_online_ORIG_881981006_20150610_10648.rman
channel t1: piece handle=database_online_ORIG_881981006_20150610_10648.rman tag=TAG20150609T235003
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:16:05
channel t1: starting datafile backup set restore
channel t1: specifying datafile(s) to restore from backup set
channel t1: restoring datafile 00046 to /oracle/EDER/oradata/ORIG/datafile/o1_mf_sys_undo_b528l1mb_.dbf
channel t1: reading from backup piece database_online_ORIG_881981461_20150610_10650.rman
channel t1: piece handle=database_online_ORIG_881981461_20150610_10650.rman tag=TAG20150609T235003
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:09:35
channel t1: starting datafile backup set restore
channel t1: specifying datafile(s) to restore from backup set
channel t1: restoring datafile 00045 to /oracle/EDER/oradata/ORIG/datafile/o1_mf_sys_undo_b528kzkl_.dbf
channel t1: reading from backup piece database_online_ORIG_881982216_20150610_10654.rman
channel t1: piece handle=database_online_ORIG_881982216_20150610_10654.rman tag=TAG20150609T235003
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:09:35
channel t1: starting datafile backup set restore
channel t1: specifying datafile(s) to restore from backup set
channel t1: restoring datafile 00001 to /oracle/EDER/oradata/ORIG/datafile/o1_mf_system_88kqvc1l_.dbf
channel t1: restoring datafile 00044 to /oracle/EDER/oradata/ORIG/datafile/o1_mf_sys_undo_b528ky6o_.dbf
channel t1: reading from backup piece database_online_ORIG_881983071_20150610_10655.rman
channel t1: piece handle=database_online_ORIG_881983071_20150610_10655.rman tag=TAG20150609T235003
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:09:15
channel t1: starting datafile backup set restore
channel t1: specifying datafile(s) to restore from backup set
channel t1: restoring datafile 00027 to /oracle/EDER/oradata/ORIG/datafile/o1_mf_users_9hpo8hrx_.dbf
channel t1: restoring datafile 00043 to /oracle/EDER/oradata/ORIG/datafile/o1_mf_sys_undo_b528ks5b_.dbf
channel t1: reading from backup piece database_online_ORIG_881983866_20150610_10658.rman
channel t1: piece handle=database_online_ORIG_881983866_20150610_10658.rman tag=TAG20150609T235003
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:08:45
channel t1: starting datafile backup set restore
channel t1: specifying datafile(s) to restore from backup set
channel t1: restoring datafile 00040 to /oracle/EDER/oradata/ORIG/datafile/o1_mf_sys_undo_b528k3dw_.dbf
channel t1: reading from backup piece database_online_ORIG_881985397_20150610_10660.rman
channel t1: piece handle=database_online_ORIG_881985397_20150610_10660.rman tag=TAG20150609T235003
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:08:45
channel t1: starting datafile backup set restore
channel t1: specifying datafile(s) to restore from backup set
channel t1: restoring datafile 00041 to /oracle/EDER/oradata/ORIG/datafile/o1_mf_sys_undo_b528kot2_.dbf
channel t1: reading from backup piece database_online_ORIG_881986182_20150610_10663.rman
channel t1: piece handle=database_online_ORIG_881986182_20150610_10663.rman tag=TAG20150609T235003
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:08:55
channel t1: starting datafile backup set restore
channel t1: specifying datafile(s) to restore from backup set
channel t1: restoring datafile 00042 to /oracle/EDER/oradata/ORIG/datafile/o1_mf_sys_undo_b528kqdb_.dbf
channel t1: reading from backup piece database_online_ORIG_881986787_20150610_10664.rman
channel t1: piece handle=database_online_ORIG_881986787_20150610_10664.rman tag=TAG20150609T235003
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:08:35
Finished restore at 13-JUN-15

Starting recover at 13-JUN-15

Executing: alter database datafile 53 offline
Executing: alter database datafile 54 offline
Executing: alter database datafile 55 offline
Executing: alter database datafile 5 offline
Executing: alter database datafile 22 offline
Executing: alter database datafile 6 offline
Executing: alter database datafile 23 offline
Executing: alter database datafile 25 offline
Executing: alter database datafile 30 offline
Executing: alter database datafile 31 offline
Executing: alter database datafile 32 offline
Executing: alter database datafile 33 offline
Executing: alter database datafile 34 offline
Executing: alter database datafile 7 offline
Executing: alter database datafile 28 offline
Executing: alter database datafile 29 offline
Executing: alter database datafile 4 offline
Executing: alter database datafile 8 offline
Executing: alter database datafile 9 offline
Executing: alter database datafile 10 offline
Executing: alter database datafile 11 offline
Executing: alter database datafile 12 offline
Executing: alter database datafile 13 offline
Executing: alter database datafile 14 offline
Executing: alter database datafile 15 offline
Executing: alter database datafile 16 offline
Executing: alter database datafile 17 offline
Executing: alter database datafile 18 offline
Executing: alter database datafile 19 offline
Executing: alter database datafile 20 offline
Executing: alter database datafile 21 offline
Executing: alter database datafile 35 offline
Executing: alter database datafile 36 offline
Executing: alter database datafile 37 offline
Executing: alter database datafile 38 offline
Executing: alter database datafile 39 offline
Executing: alter database datafile 47 offline
Executing: alter database datafile 48 offline
Executing: alter database datafile 49 offline
Executing: alter database datafile 50 offline
Executing: alter database datafile 51 offline
Executing: alter database datafile 26 offline
starting media recovery

new media label is 36077 for piece redolog_ORIG_881978427_20150610_10640_1.rman
new media label is 36135 for piece redolog_ORIG_881978427_20150610_10640_2.rman
new media label is 36077 for piece redolog_ORIG_881980221_20150610_10645_1.rman
new media label is 36135 for piece redolog_ORIG_881980221_20150610_10645_2.rman
new media label is 36077 for piece redolog_ORIG_881981459_20150610_10649_1.rman
new media label is 36135 for piece redolog_ORIG_881981459_20150610_10649_2.rman
new media label is 36077 for piece redolog_ORIG_881982028_20150610_10652_1.rman
new media label is 36135 for piece redolog_ORIG_881982028_20150610_10652_2.rman
new media label is 36077 for piece redolog_ORIG_881983831_20150610_10656_1.rman
new media label is 36135 for piece redolog_ORIG_881983831_20150610_10656_2.rman
new media label is 36077 for piece redolog_ORIG_881985623_20150610_10661_1.rman
new media label is 36135 for piece redolog_ORIG_881985623_20150610_10661_2.rman
new media label is 36077 for piece redolog_ORIG_881987217_20150610_10666_1.rman
new media label is 36135 for piece redolog_ORIG_881987217_20150610_10666_2.rman
channel t1: starting archived log restore to default destination
channel t1: restoring archived log
archived log thread=1 sequence=8367
channel t1: reading from backup piece redolog_ORIG_881978427_20150610_10640_1.rman
channel t1: piece handle=redolog_ORIG_881978427_20150610_10640_1.rman tag=TAG20150610T020027
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:01:55
archived log file name=/oracle/ORIG/recovery_area/ORIG/archivelog/2015_06_13/o1_mf_1_8367_bqq4ffhl_.arc thread=1 sequence=8367
channel default: deleting archived log(s)
archived log file name=/oracle/ORIG/recovery_area/ORIG/archivelog/2015_06_13/o1_mf_1_8367_bqq4ffhl_.arc RECID=12056 STAMP=882245581
channel t1: starting archived log restore to default destination
channel t1: restoring archived log
archived log thread=1 sequence=8368
channel t1: reading from backup piece redolog_ORIG_881980221_20150610_10645_1.rman
channel t1: piece handle=redolog_ORIG_881980221_20150610_10645_1.rman tag=TAG20150610T023020
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:00:07
archived log file name=/oracle/ORIG/recovery_area/ORIG/archivelog/2015_06_13/o1_mf_1_8368_bqq4fv0v_.arc thread=1 sequence=8368
channel default: deleting archived log(s)
archived log file name=/oracle/ORIG/recovery_area/ORIG/archivelog/2015_06_13/o1_mf_1_8368_bqq4fv0v_.arc RECID=12057 STAMP=882245595
channel t1: starting archived log restore to default destination
channel t1: restoring archived log
archived log thread=1 sequence=8369
channel t1: reading from backup piece redolog_ORIG_881981459_20150610_10649_1.rman
channel t1: piece handle=redolog_ORIG_881981459_20150610_10649_1.rman tag=TAG20150610T025059
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:00:03
archived log file name=/oracle/ORIG/recovery_area/ORIG/archivelog/2015_06_13/o1_mf_1_8369_bqq4g21k_.arc thread=1 sequence=8369
channel default: deleting archived log(s)
archived log file name=/oracle/ORIG/recovery_area/ORIG/archivelog/2015_06_13/o1_mf_1_8369_bqq4g21k_.arc RECID=12058 STAMP=882245602
channel t1: starting archived log restore to default destination
channel t1: restoring archived log
archived log thread=1 sequence=8370
channel t1: reading from backup piece redolog_ORIG_881982028_20150610_10652_1.rman
channel t1: piece handle=redolog_ORIG_881982028_20150610_10652_1.rman tag=TAG20150610T030028
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:00:07
archived log file name=/oracle/ORIG/recovery_area/ORIG/archivelog/2015_06_13/o1_mf_1_8370_bqq4g5mq_.arc thread=1 sequence=8370
channel default: deleting archived log(s)
archived log file name=/oracle/ORIG/recovery_area/ORIG/archivelog/2015_06_13/o1_mf_1_8370_bqq4g5mq_.arc RECID=12059 STAMP=882245605
channel t1: starting archived log restore to default destination
channel t1: restoring archived log
archived log thread=1 sequence=8371
channel t1: reading from backup piece redolog_ORIG_881983831_20150610_10656_1.rman
channel t1: piece handle=redolog_ORIG_881983831_20150610_10656_1.rman tag=TAG20150610T033031
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:00:03
archived log file name=/oracle/ORIG/recovery_area/ORIG/archivelog/2015_06_13/o1_mf_1_8371_bqq4gfqb_.arc thread=1 sequence=8371
channel default: deleting archived log(s)
archived log file name=/oracle/ORIG/recovery_area/ORIG/archivelog/2015_06_13/o1_mf_1_8371_bqq4gfqb_.arc RECID=12060 STAMP=882245613
channel t1: starting archived log restore to default destination
channel t1: restoring archived log
archived log thread=1 sequence=8372
channel t1: reading from backup piece redolog_ORIG_881985623_20150610_10661_1.rman
channel t1: piece handle=redolog_ORIG_881985623_20150610_10661_1.rman tag=TAG20150610T040023
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:00:03
archived log file name=/oracle/ORIG/recovery_area/ORIG/archivelog/2015_06_13/o1_mf_1_8372_bqq4gjg6_.arc thread=1 sequence=8372
channel default: deleting archived log(s)
archived log file name=/oracle/ORIG/recovery_area/ORIG/archivelog/2015_06_13/o1_mf_1_8372_bqq4gjg6_.arc RECID=12061 STAMP=882245616
channel t1: starting archived log restore to default destination
channel t1: restoring archived log
archived log thread=1 sequence=8373
channel t1: reading from backup piece redolog_ORIG_881987217_20150610_10666_1.rman
channel t1: piece handle=redolog_ORIG_881987217_20150610_10666_1.rman tag=TAG20150610T042657
channel t1: restored backup piece 1
channel t1: restore complete, elapsed time: 00:00:03
archived log file name=/oracle/ORIG/recovery_area/ORIG/archivelog/2015_06_13/o1_mf_1_8373_bqq4gn4g_.arc thread=1 sequence=8373
channel default: deleting archived log(s)
archived log file name=/oracle/ORIG/recovery_area/ORIG/archivelog/2015_06_13/o1_mf_1_8373_bqq4gn4g_.arc RECID=12062 STAMP=882245620
unable to find archived log
archived log thread=1 sequence=8374
Oracle Error:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01245: offline file 4 will be lost if RESETLOGS is done
ORA-01110: data file 4: '/oracle/EDER/oradata/ORIG/datafile/o1_mf_sap_bods_88krp7c9_.dbf'

released channel: t1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 06/13/2015 04:13:43
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 8374 and starting SCN of 9111768484863

RMAN>

RMAN> exit

It's only gone and actually worked! (Well, 99% worked - the very last archivelog is missing, but in this case I don't need it - it only contains 5 minutes of data and nothing related to the schema in question, as no one was working on it at 04:25-04:30 in the morning (times for this taken from the alert log on the original database).)
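
(As a cross-check, assuming the record is still in the original database's controlfile, the time window covered by the missing sequence can be confirmed with something like this:)

-- time window of the archived log that could not be restored
select thread#, sequence#, first_time, next_time
  from v$archived_log
 where thread# = 1
   and sequence# = 8374;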

So now I just need to open the DB and extract the data.

To open the DB I have to offline drop all the files that I didn't restore, otherwise it won't let me open it - to do that I run some SQL to generate the SQL:

SQL> select 'alter database datafile '||file#||' offline drop ;'
  2  from v$recover_file where online_status='OFFLINE'
  3  /

'ALTERDATABASEDATAFILE'||FILE#||'OFFLINEDROP;'
--------------------------------------------------------------------------------
alter database datafile 4 offline drop ;
.....
alter database datafile 55 offline drop ;

Then run it:

SQL> alter database datafile 4 offline drop ;

Database altered.

etc etc

Now I can open the DB:

SQL> alter database open resetlogs;

Database altered.

Hooray!

Now I just have to export the problem schema:

 expdp / schemas=apex_workspace_er dumpfile=miracle.dmp reuse_dumpfiles=y

Export: Release 11.2.0.4.0 - Production on Sat Jun 13 09:46:10 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning option
Starting "OSAORACLE"."SYS_EXPORT_SCHEMA_01":  /******** schemas=apex_workspace_er dumpfile=miracle.dmp reuse_dumpfiles=y
Estimate in progress using BLOCKS method...
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 1024 KB
Processing object type SCHEMA_EXPORT/USER
Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
Processing object type SCHEMA_EXPORT/ROLE_GRANT
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/TABLESPACE_QUOTA
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Processing object type SCHEMA_EXPORT/SEQUENCE/SEQUENCE
Processing object type SCHEMA_EXPORT/TABLE/TABLE
Processing object type SCHEMA_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type SCHEMA_EXPORT/TABLE/COMMENT
Processing object type SCHEMA_EXPORT/PACKAGE/PACKAGE_SPEC
Processing object type SCHEMA_EXPORT/FUNCTION/FUNCTION
Processing object type SCHEMA_EXPORT/PACKAGE/COMPILE_PACKAGE/PACKAGE_SPEC/ALTER_PACKAGE_SPEC
Processing object type SCHEMA_EXPORT/FUNCTION/ALTER_FUNCTION
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type SCHEMA_EXPORT/VIEW/VIEW
Processing object type SCHEMA_EXPORT/PACKAGE/PACKAGE_BODY
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/TRIGGER
Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
. . exported "APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP" 17.64 KB       2 rows
. . exported "APEX_WORKSPACE_ER"."APEX_GENERIC_UPLOAD_DATA_TMP" 24.50 KB       8 rows
. . exported "APEX_WORKSPACE_ER"."APEX_REF_ADJUSTMENT_KPI_SUBTYP" 5.648 KB       6 rows
. . exported "APEX_WORKSPACE_ER"."APEX_ROLE"            8.390 KB      10 rows
. . exported "APEX_WORKSPACE_ER"."APEX_STEERING_XGME_TMP" 11.78 KB      40 rows
. . exported "APEX_WORKSPACE_ER"."APEX_USER"            8.726 KB      34 rows
. . exported "APEX_WORKSPACE_ER"."APEX_USER_ROLE"       12.89 KB      96 rows
Master table "OSAORACLE"."SYS_EXPORT_SCHEMA_01" successfully loaded/unloaded
******************************************************************************
Dump file set for OSAORACLE.SYS_EXPORT_SCHEMA_01 is:
  /oracle/11.2.0.4.0.DB/rdbms/log/miracle.dmp
Job "OSAORACLE"."SYS_EXPORT_SCHEMA_01" successfully completed at Sat Jun 13 09:46:42 2015 elapsed 0 00:00:25

So that's all good.

Now I copy it over to the original machine and import it (having dropped the bad tablespace first) - the import ran, but with loads of errors, as I forgot to create the tablespace again - see the log here:

 impdp / directory=tmp dumpfile=miracle.dmp

Import: Release 11.2.0.4.0 - Production on Sat Jun 13 09:52:12 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning option
Master table "OSAORACLE"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
Starting "OSAORACLE"."SYS_IMPORT_FULL_01":  /******** directory=tmp dumpfile=miracle.dmp
Processing object type SCHEMA_EXPORT/USER
ORA-31684: Object type USER:"APEX_WORKSPACE_ER" already exists
Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
Processing object type SCHEMA_EXPORT/ROLE_GRANT
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/TABLESPACE_QUOTA
ORA-39083: Object type TABLESPACE_QUOTA failed to create with error:
ORA-00959: tablespace 'APEX_WORKSPACE_ER' does not exist
Failing sql is:
DECLARE   TEMP_COUNT NUMBER;   SQLSTR VARCHAR2(200); BEGIN   SQLSTR := 'ALTER USER "APEX_WORKSPACE_ER" QUOTA UNLIMITED ON "APEX_WORKSPACE_ER"';  EXECUTE IMMEDIATE SQLSTR;EXCEPTION   WHEN OTHERS THEN    IF SQLCODE = -30041 THEN       SQLSTR := 'SELECT COUNT(*) FROM USER_TABLESPACES               WHERE TABLESPACE_NAME = ''APEX_WORKSPACE_ER'' AND CONTENTS = ''TEMPORARY''
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Processing object type SCHEMA_EXPORT/SEQUENCE/SEQUENCE
Processing object type SCHEMA_EXPORT/TABLE/TABLE
ORA-39083: Object type TABLE:"APEX_WORKSPACE_ER"."APEX_ROLE" failed to create with error:
ORA-00959: tablespace 'APEX_WORKSPACE_ER' does not exist
Failing sql is:
CREATE TABLE "APEX_WORKSPACE_ER"."APEX_ROLE" ("ROLE_SID" NUMBER(12,0) NOT NULL ENABLE, "ROLE_NAME" VARCHAR2(30 CHAR) NOT NULL ENABLE, "ROLE_DESCRIPTION" VARCHAR2(100 CHAR) NOT NULL ENABLE, "INSERT_TS" DATE NOT NULL ENABLE, "INSERT_USER_KID" VARCHAR2(30 CHAR) NOT NULL ENABLE, "DELETE_TS" DATE NOT NULL ENABLE, "UPDATE_USER_KID" VARCHAR2(30 CHAR)) SE
ORA-39083: Object type TABLE:"APEX_WORKSPACE_ER"."APEX_USER" failed to create with error:
ORA-00959: tablespace 'APEX_WORKSPACE_ER' does not exist
Failing sql is:
CREATE TABLE "APEX_WORKSPACE_ER"."APEX_USER" ("USER_KID" VARCHAR2(30 CHAR) NOT NULL ENABLE, "USER_NAME" VARCHAR2(100 CHAR) NOT NULL ENABLE, "INSERT_TS" DATE NOT NULL ENABLE, "INSERT_USER_KID" VARCHAR2(30 CHAR) NOT NULL ENABLE, "DELETE_TS" DATE NOT NULL ENABLE, "UPDATE_USER_KID" VARCHAR2(30 CHAR)) SEGMENT CREATION IMMEDIATE PCTFREE 10 PCTUSED 40 IN
ORA-39083: Object type TABLE:"APEX_WORKSPACE_ER"."APEX_USER_ROLE" failed to create with error:
ORA-00959: tablespace 'APEX_WORKSPACE_ER' does not exist
Failing sql is:
CREATE TABLE "APEX_WORKSPACE_ER"."APEX_USER_ROLE" ("USER_KID" VARCHAR2(30 CHAR) NOT NULL ENABLE, "ROLE_SID" NUMBER(12,0) NOT NULL ENABLE, "VALID_FROM" DATE NOT NULL ENABLE, "VALID_TO" DATE NOT NULL ENABLE, "INSERT_TS" DATE NOT NULL ENABLE, "INSERT_USER_KID" VARCHAR2(30 CHAR) NOT NULL ENABLE, "DELETE_TS" DATE NOT NULL ENABLE, "UPDATE_USER_KID"
ORA-39083: Object type TABLE:"APEX_WORKSPACE_ER"."APEX_GENERIC_UPLOAD_DATA_TMP" failed to create with error:
ORA-00959: tablespace 'APEX_WORKSPACE_ER' does not exist
Failing sql is:
CREATE TABLE "APEX_WORKSPACE_ER"."APEX_GENERIC_UPLOAD_DATA_TMP" ("DATA_ITEM_ID" NUMBER(22,0) NOT NULL ENABLE, "TRADE_SID" NUMBER(12,0), "SRC_SYS_TRADE_CODE" NUMBER(22,0), "BOOK_SID" NUMBER(12,0), "BOOK_NAME" VARCHAR2(200 CHAR), "COUNTERPARTY_SID" NUMBER(12,0), "MDM_COUNTERPARTY_ID" NUMBER(12,0), "DIRECTION_TYPE" VARCHAR2(5 CHAR)
ORA-39083: Object type TABLE:"APEX_WORKSPACE_ER"."APEX_REF_ADJUSTMENT_KPI_SUBTYP" failed to create with error:
ORA-00959: tablespace 'APEX_WORKSPACE_ER' does not exist
Failing sql is:
CREATE TABLE "APEX_WORKSPACE_ER"."APEX_REF_ADJUSTMENT_KPI_SUBTYP" ("KPI_TYPE" VARCHAR2(100 CHAR), "KPI_SUBTYPE" VARCHAR2(100 CHAR)) SEGMENT CREATION IMMEDIATE PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST
ORA-39083: Object type TABLE:"APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP" failed to create with error:
ORA-00959: tablespace 'APEX_WORKSPACE_ER' does not exist
Failing sql is:
CREATE TABLE "APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP" ("ADJUSTMENT_ID" NUMBER(12,0) NOT NULL ENABLE, "ADJUSTMENT_TYPE" VARCHAR2(20 CHAR) NOT NULL ENABLE, "IS_ACTIVE_YN" VARCHAR2(1 CHAR), "ADJUSTMENT_SCOPE" VARCHAR2(20 CHAR) NOT NULL ENABLE, "VALID_FROM" DATE NOT NULL ENABLE, "VALID_TO" DATE NOT NULL ENABLE, "VALUATION_VE
ORA-39083: Object type TABLE:"APEX_WORKSPACE_ER"."APEX_STEERING_XGME_TMP" failed to create with error:
ORA-00959: tablespace 'APEX_WORKSPACE_ER' does not exist
Failing sql is:
CREATE TABLE "APEX_WORKSPACE_ER"."APEX_STEERING_XGME_TMP" ("REFERENCE" VARCHAR2(30 CHAR) NOT NULL ENABLE, "DEAL_NUMBER" NUMBER(12,0) NOT NULL ENABLE, "PORTFOLIO" VARCHAR2(30 CHAR) NOT NULL ENABLE, "SOURCE_SYSTEM" VARCHAR2(30 CHAR) NOT NULL ENABLE, "DEAL_GROUP" VARCHAR2(30 CHAR) NOT NULL ENABLE, "FIRM_PR" VARCHAR2(30 CHAR) NOT NULL ENA
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
Processing object type SCHEMA_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT
ORA-39112: Dependent object type OBJECT_GRANT:"APEX_WORKSPACE_ER" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_ROLE" creation failed
ORA-39112: Dependent object type OBJECT_GRANT:"APEX_WORKSPACE_ER" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_USER" creation failed
ORA-39112: Dependent object type OBJECT_GRANT:"APEX_WORKSPACE_ER" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_USER_ROLE" creation failed
ORA-39112: Dependent object type OBJECT_GRANT:"APEX_WORKSPACE_ER" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_GENERIC_UPLOAD_DATA_TMP" creation failed
ORA-39112: Dependent object type OBJECT_GRANT:"APEX_WORKSPACE_ER" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_GENERIC_UPLOAD_DATA_TMP" creation failed
ORA-39112: Dependent object type OBJECT_GRANT:"APEX_WORKSPACE_ER" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_GENERIC_UPLOAD_DATA_TMP" creation failed
ORA-39112: Dependent object type OBJECT_GRANT:"APEX_WORKSPACE_ER" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_GENERIC_UPLOAD_DATA_TMP" creation failed
ORA-39112: Dependent object type OBJECT_GRANT:"APEX_WORKSPACE_ER" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_GENERIC_UPLOAD_DATA_TMP" creation failed
ORA-39112: Dependent object type OBJECT_GRANT:"APEX_WORKSPACE_ER" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP" creation failed
ORA-39112: Dependent object type OBJECT_GRANT:"APEX_WORKSPACE_ER" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP" creation failed
ORA-39112: Dependent object type OBJECT_GRANT:"APEX_WORKSPACE_ER" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP" creation failed
ORA-39112: Dependent object type OBJECT_GRANT:"APEX_WORKSPACE_ER" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP" creation failed
ORA-39112: Dependent object type OBJECT_GRANT:"APEX_WORKSPACE_ER" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP" creation failed
TABLE:"APEX_WORKSPACE_ER"."APEX_REF_ADJUSTMENT_KPI_SUBTYP" creation failed
ORA-39112: Dependent object type COMMENT skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_REF_ADJUSTMENT_KPI_SUBTYP" creation failed
ORA-39112: Dependent object type COMMENT skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_REF_ADJUSTMENT_KPI_SUBTYP" creation failed
Processing object type SCHEMA_EXPORT/PACKAGE/PACKAGE_SPEC
Processing object type SCHEMA_EXPORT/FUNCTION/FUNCTION
Processing object type SCHEMA_EXPORT/PACKAGE/COMPILE_PACKAGE/PACKAGE_SPEC/ALTER_PACKAGE_SPEC
Processing object type SCHEMA_EXPORT/FUNCTION/ALTER_FUNCTION
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
ORA-39112: Dependent object type INDEX:"APEX_WORKSPACE_ER"."APEX_ROLE_PK" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_ROLE" creation failed
ORA-39112: Dependent object type INDEX:"APEX_WORKSPACE_ER"."APEX_USER_PK" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_USER" creation failed
ORA-39112: Dependent object type INDEX:"APEX_WORKSPACE_ER"."APEX_USER_ROLE_PK" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_USER_ROLE" creation failed
ORA-39112: Dependent object type INDEX:"APEX_WORKSPACE_ER"."APEX_GEN_UPLOAD_DATA_TMP_PK" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_GENERIC_UPLOAD_DATA_TMP" creation failed
ORA-39112: Dependent object type INDEX:"APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP_PK" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP" creation failed
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
ORA-39112: Dependent object type CONSTRAINT:"APEX_WORKSPACE_ER"."APEX_GEN_UPLOAD_DATA_TMP_PK" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_GENERIC_UPLOAD_DATA_TMP" creation failed
ORA-39112: Dependent object type CONSTRAINT:"APEX_WORKSPACE_ER"."APEX_ROLE_PK" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_ROLE" creation failed
ORA-39112: Dependent object type CONSTRAINT:"APEX_WORKSPACE_ER"."APEX_USER_PK" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_USER" creation failed
ORA-39112: Dependent object type CONSTRAINT:"APEX_WORKSPACE_ER"."APEX_USER_ROLE_PK" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_USER_ROLE" creation failed
ORA-39112: Dependent object type CONSTRAINT:"APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP_PK" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP" creation failed
ORA-39112: Dependent object type CONSTRAINT:"APEX_WORKSPACE_ER"."SYS_C00157591" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_STEERING_XGME_TMP" creation failed
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
ORA-39112: Dependent object type INDEX_STATISTICS skipped, base object type INDEX:"APEX_WORKSPACE_ER"."APEX_ROLE_PK" creation failed
ORA-39112: Dependent object type INDEX_STATISTICS skipped, base object type INDEX:"APEX_WORKSPACE_ER"."APEX_USER_PK" creation failed
ORA-39112: Dependent object type INDEX_STATISTICS skipped, base object type INDEX:"APEX_WORKSPACE_ER"."APEX_USER_ROLE_PK" creation failed
ORA-39112: Dependent object type INDEX_STATISTICS skipped, base object type INDEX:"APEX_WORKSPACE_ER"."APEX_GEN_UPLOAD_DATA_TMP_PK" creation failed
ORA-39112: Dependent object type INDEX_STATISTICS skipped, base object type INDEX:"APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP_PK" creation failed
ORA-39112: Dependent object type INDEX_STATISTICS skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_STEERING_XGME_TMP" creation failed
Processing object type SCHEMA_EXPORT/VIEW/VIEW
Processing object type SCHEMA_EXPORT/PACKAGE/PACKAGE_BODY
ORA-39082: Object type PACKAGE_BODY:"APEX_WORKSPACE_ER"."APEX_AUTHORIZATION_UTIL" created with compilation warnings
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT
ORA-39112: Dependent object type REF_CONSTRAINT:"APEX_WORKSPACE_ER"."APEX_USER_ROLE_APEX_USER_FK" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_USER_ROLE" creation failed
ORA-39112: Dependent object type REF_CONSTRAINT:"APEX_WORKSPACE_ER"."APEX_USER_ROLE_APEX_ROLE_FK" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_USER_ROLE" creation failed
Processing object type SCHEMA_EXPORT/TABLE/TRIGGER
ORA-39112: Dependent object type TRIGGER:"APEX_WORKSPACE_ER"."MANUAL_ADJUSTMENTS_TMP_INS" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP" creation failed
ORA-39112: Dependent object type TRIGGER:"APEX_WORKSPACE_ER"."GENERIC_UPLOAD_DATA_TMP_INS" skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_GENERIC_UPLOAD_DATA_TMP" creation failed
Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
ORA-39112: Dependent object type TABLE_STATISTICS skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_ROLE" creation failed
ORA-39112: Dependent object type TABLE_STATISTICS skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_USER" creation failed
ORA-39112: Dependent object type TABLE_STATISTICS skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_USER_ROLE" creation failed
ORA-39112: Dependent object type TABLE_STATISTICS skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_GENERIC_UPLOAD_DATA_TMP" creation failed
ORA-39112: Dependent object type TABLE_STATISTICS skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_REF_ADJUSTMENT_KPI_SUBTYP" creation failed
ORA-39112: Dependent object type TABLE_STATISTICS skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP" creation failed
ORA-39112: Dependent object type TABLE_STATISTICS skipped, base object type TABLE:"APEX_WORKSPACE_ER"."APEX_STEERING_XGME_TMP" creation failed
Job "OSAORACLE"."SYS_IMPORT_FULL_01" completed with 148 error(s) at Sat Jun 13 09:52:16 2015 elapsed 0 00:00:03

So now i create the tablespace (note OMF so no messy file names etc....)

SQL> create tablespace APEX_WORKSPACE_ER;

Tablespace created.

Now i do the import again which throws a lot of ignorable errors as the initial import only partially worked

 impdp / directory=tmp dumpfile=miracle.dmp

Import: Release 11.2.0.4.0 - Production on Sat Jun 13 09:52:39 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning option
Master table "OSAORACLE"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
Starting "OSAORACLE"."SYS_IMPORT_FULL_01":  /******** directory=tmp dumpfile=miracle.dmp
Processing object type SCHEMA_EXPORT/USER
ORA-31684: Object type USER:"APEX_WORKSPACE_ER" already exists
Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
Processing object type SCHEMA_EXPORT/ROLE_GRANT
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/TABLESPACE_QUOTA
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Processing object type SCHEMA_EXPORT/SEQUENCE/SEQUENCE
ORA-31684: Object type SEQUENCE:"APEX_WORKSPACE_ER"."APEX_STEERING_XGME_TMP_SEQ" already exists
Processing object type SCHEMA_EXPORT/TABLE/TABLE
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
. . imported "APEX_WORKSPACE_ER"."APEX_MANUAL_ADJUSTMENTS_TMP" 17.64 KB       2 rows
. . imported "APEX_WORKSPACE_ER"."APEX_GENERIC_UPLOAD_DATA_TMP" 24.50 KB       8 rows
. . imported "APEX_WORKSPACE_ER"."APEX_REF_ADJUSTMENT_KPI_SUBTYP" 5.648 KB       6 rows
. . imported "APEX_WORKSPACE_ER"."APEX_ROLE"            8.390 KB      10 rows
. . imported "APEX_WORKSPACE_ER"."APEX_STEERING_XGME_TMP" 11.78 KB      40 rows
. . imported "APEX_WORKSPACE_ER"."APEX_USER"            8.726 KB      34 rows
. . imported "APEX_WORKSPACE_ER"."APEX_USER_ROLE"       12.89 KB      96 rows
Processing object type SCHEMA_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type SCHEMA_EXPORT/TABLE/COMMENT
Processing object type SCHEMA_EXPORT/PACKAGE/PACKAGE_SPEC
ORA-31684: Object type PACKAGE:"APEX_WORKSPACE_ER"."APEX_AUTHORIZATION_UTIL" already exists
Processing object type SCHEMA_EXPORT/FUNCTION/FUNCTION
ORA-31684: Object type FUNCTION:"APEX_WORKSPACE_ER"."F_CHECK_DATE_FORMAT" already exists
ORA-31684: Object type FUNCTION:"APEX_WORKSPACE_ER"."F_REGISTER_TARGET" already exists
ORA-31684: Object type FUNCTION:"APEX_WORKSPACE_ER"."F_START_LOG" already exists
ORA-31684: Object type FUNCTION:"APEX_WORKSPACE_ER"."F_STOP_LOG" already exists
Processing object type SCHEMA_EXPORT/PACKAGE/COMPILE_PACKAGE/PACKAGE_SPEC/ALTER_PACKAGE_SPEC
Processing object type SCHEMA_EXPORT/FUNCTION/ALTER_FUNCTION
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type SCHEMA_EXPORT/VIEW/VIEW
ORA-31684: Object type VIEW:"APEX_WORKSPACE_ER"."APEX_BOOK_INST" already exists
Processing object type SCHEMA_EXPORT/PACKAGE/PACKAGE_BODY
ORA-31684: Object type PACKAGE_BODY:"APEX_WORKSPACE_ER"."APEX_AUTHORIZATION_UTIL" already exists
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/TRIGGER
Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Job "OSAORACLE"."SYS_IMPORT_FULL_01" completed with 9 error(s) at Sat Jun 13 09:52:43 2015 elapsed 0 00:00:03

And everything is back as it should be.

Martini's all round i think.......

There is a bug somewhere with flashback/controlfiles/incarnations - it's a specific set of circumstances that seems to create it and we still haven't resolved what that is - but the technique above may help someone out of a hole, and it was quite an interesting battle......


Cloud control and the SLES problem



As some of you may have read I've been investing some time in making use of cloud control as a CMDB (or something CMDB like at least) - everyone has a CMDB right.....?

Well anyway we have been arranging with our OS support teams to patch SLES 10 and SLES 11 up to the latest patchset across the whole estate. I wanted to use cloud control to validate what actually needed to be done and I thought this information would already be gathered - and it is..... just not completely - the version string is truncated because it's too long.....see the screenshot below


I had hoped this was just a screen display issue and that the actual data stored was 'OK' - but it wasn't; this truncated string is the collected value, which is of no use to me.....

So what to do?

A metric extension is called for.

Now I've covered all the basics in a lot of detail before - see here - so I'm just amending that slightly to gather the OS version info.

The key screen to share is this one - the rest I'm sure you can work out for yourselves


The key elements are:

1) Command set to "ksh"
2) script set to version.ksh
3) the contents of version.ksh are displayed in the popup - this simply takes the 3 lines present in /etc/SuSE-release and joins them into a single line using the xargs trick - this tiny script is deployed with the metric - nothing manual to do here.
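For reference, a minimal sketch of what such a version.ksh could look like - the real script was only visible in the screenshot, so this is an assumption based on the description above:

#!/bin/ksh
# join the three lines of /etc/SuSE-release into a single line
cat /etc/SuSE-release | xargs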

So now the metric will collect the exact version information and store it in the OMS repo once i publish the metric extension to all my SLES hosts.

So I do that and then a short while later (I'm not sure of the exact timings of the background aggregation jobs that cloud control runs) the info is queryable using my previously created SQL for other metrics (the data can be seen in real time by browsing metric extensions for the host directly in the web page).

In my case i called the metric sles-version in the definition so that is what i need to query on

SELECT target_name, value, last_collection_time
  FROM sysman.GC_METRIC_STR_VALUES_latest mv, MGMT_TARGETS MT
 WHERE ENTITY_GUID = MT.Target_Guid
   AND METRIC_COLUMN_NAME = 'sles-version'


This produces results similar to the below (i hid the target names in this example)


Note also the use of the last_collection_time value - this is important as it proves the data is current - it would be easy to collect the info once and then something breaks and you wouldn't necessarily realize the data was very out of date.

I guess this little feature will be fixed in the main code line at some point to not truncate - but the tool is so powerful i could just fix and collect what i wanted with a metric extension in a few minutes.

This illustrates again how powerful metric extensions are.

The triangle of doom



Occasionally when you look at the performance screen in cloud control you get a surprise - but i wonder how many of you have seen a screen like this.....


Pretty daunting...

Clicking on the concurrency link to the right shows this


Which is only slightly less alarming :-)

The strangest thing when I first looked at this was that all of the usernames were showing as SYS, yet the program name showed the connections were clearly coming from our application server - and we are definitely not using SYS in that configuration. Very strange.

So I logged on to sqlplus to look at the basic info - which showed me this from v$session (lots of rows, all with the same info)

USERNAME                       PROGRAM                                          SQL_ID
------------------------------ ------------------------------------------------ -------------
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne
                               com.mchange.v2.async.ThreadPoolAsynchronousRunne

So a null username (which I guess causes the SYS display in cloud control) and no sql_id running.

Very odd

Lets have a look at the listener log

And we see lots of what look like successful messages - at least the listener thinks it's done its job

18-JUN-2015 14:54:16 * (CONNECT_DATA=(SID=DB)(SERVER=DEDICATED)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=APPSERVER$))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=56362)) * establish * DB * 0
18-JUN-2015 14:54:16 * (CONNECT_DATA=(SID=DB)(SERVER=DEDICATED)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=APPSERVER$))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=56363)) * establish * DB * 0
18-JUN-2015 14:54:16 * (CONNECT_DATA=(SID=DB)(SERVER=DEDICATED)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=APPSERVER$))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=56365)) * establish * DB * 0
18-JUN-2015 14:54:16 * (CONNECT_DATA=(SID=DB)(SERVER=DEDICATED)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=APPSERVER$))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=56364)) * establish * DB * 0
18-JUN-2015 14:54:16 * (CONNECT_DATA=(SID=DB)(SERVER=DEDICATED)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=APPSERVER$))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=56366)) * establish * DB * 0
18-JUN-2015 14:54:16 * (CONNECT_DATA=(SID=DB)(SERVER=DEDICATED)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=APPSERVER$))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=56367)) * establish * DB * 0
18-JUN-2015 14:54:16 * (CONNECT_DATA=(SID=DB)(SERVER=DEDICATED)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=APPSERVER$))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=56368)) * establish * DB * 0
18-JUN-2015 14:54:16 * (CONNECT_DATA=(SID=DB)(SERVER=DEDICATED)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=APPSERVER$))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=56369)) * establish * DB * 0
18-JUN-2015 14:54:16 * (CONNECT_DATA=(SID=DB)(SERVER=DEDICATED)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=APPSERVER$))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=56370)) * establish * DB * 0
18-JUN-2015 14:54:16 * (CONNECT_DATA=(SID=DB)(SERVER=DEDICATED)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=APPSERVER$))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=56371)) * establish * DB * 0

I also then checked the alert log and saw messages like this repeated many times

Fatal NI connect error 12170.

  VERSION INFORMATION:
TNS for Linux: Version 12.1.0.1.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 12.1.0.1.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 12.1.0.1.0 - Production
  Time: 18-JUN-2015 14:56:46
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12535

TNS-12535: TNS:operation timed out
    ns secondary err code: 12606
    nt main err code: 0
    nt secondary err code: 0
    nt OS err code: 0
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.10)(PORT=56097))
WARNING: inbound connection timed out (ORA-3136)

So it looks like the client is not providing the connection info in time, or there is some network timeout, or the connection process is simply taking too long.
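For context, ORA-3136 is governed by the inbound connect timeout settings - not the root cause here as it turned out, but worth knowing where the knobs are (the 60 second values below are just illustrative defaults, and LISTENER is a placeholder for your listener name):

# sqlnet.ora on the database server
SQLNET.INBOUND_CONNECT_TIMEOUT = 60

# listener.ora
INBOUND_CONNECT_TIMEOUT_LISTENER = 60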

From the application server they report that the connection is not working so i logged on to that to see if there were any clues there

Firing up sqlplus i try this


I get a long pause and then the error you see above

However, and most bizarrely of all, if I choose any user apart from the application login it connects instantly!

stranger and stranger.....

So that rules out pretty much everything other than a problem with that account.....

Then it emerged that the password had been changed earlier that day

So i thought OK i'll try changing it again just to see what happens - and the command just hung - in the end i gave up and killed it.

OK - lets try a complete stop of the app server and a database restart.

So we do that and then sqlplus connections work fine and all appears OK - then we start the appserver and everything breaks again!

So now I'm thinking the appserver must have the wrong password and is in some kind of retry loop, swamping the database with so many requests that it loses the plot - there is probably an appserver log somewhere full of errors but I don't know where to look for that.

Anyway i track down the password config screen and change the password (with the app completely stopped). Then i bring up the app again and......

It all works fine and the triangle of doom has vanished.

I'm not sure exactly why Oracle behaves that way - I don't know if multiple requests have just overwhelmed it (although connections as other users worked fine) or if essentially the user$ row is locked out in some way by the volume of invalid connection requests.
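If it happens again, a quick check I'd run (assuming the failed-login storm theory is right - that's my assumption, not something I proved here) is whether sessions are queueing on the kind of locks wrong-password retries tend to cause:

select username, event, count(*)
  from v$session
 where event in ('library cache lock', 'row cache lock')
 group by username, event;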

Anyway an interesting one....


12c dataguard issue follow up



I posted a couple of months ago about an issue i was having with 12c dataguard not working as i expected - see here

I then found my own workaround for it - see here

However the original ticket i raised with support came back suggesting that this 'manual' method of controlling services should not be used and the changes that have happened in this area have been done deliberately because of CDB/PDB architecture changes.

I wasn't happy about this so i contacted someone at Oracle who would have the inside line about what was happening here.

He replied and actually was not able to reproduce the issue; he was running with Oracle Restart though (and even though the service was not created through Restart, I thought perhaps this was having some effect).

Anyway I tried the test case again on an installation I had with Oracle Restart and it did seem to work fine, but I was still a little confused - why should this really make any difference?

So i tried the test case again on a 'traditional' database and guess what it worked! ??????

What happened?

Well at first i couldn't believe it but then i went back and compared the test cases - there was a difference....

The original non working test case used this block of code

begin 
dbms_service.create_service( service_name => 'FRED', 
network_name => 'FRED', 
failover_method => 'BASIC', 
failover_type => 'SELECT', 
failover_retries => 180, 
failover_delay => 1); 
end; 
/


The new one just this (with no TAF settings)

begin 
dbms_service.create_service( service_name => 'FRED', 
network_name => 'FRED');
end;
/

So now we are getting somewhere - and if we look at the doc page for this proc we get a hint as to why https://docs.oracle.com/database/121/ARPLS/d_serv.htm#ARPLS68020

There are 2 versions of the procedure - the original one and a new overloaded one, which seems to be the one that should be used.

So lets try out the new way of setting up a TAF service

So we create the basics of the service with minimal parameters

exec DBMS_SERVICE.CREATE_SERVICE('FRED2','FRED2');

Then we build an array of parameters to pass in to modify service

DECLARE
   params dbms_service.svc_parameter_array;
   BEGIN
      params('FAILOVER_METHOD')            :='BASIC';
      params('FAILOVER_TYPE')            :='SELECT';
      params('FAILOVER_DELAY')           :=1;
      params('FAILOVER_RETRIES')         :=180;
      DBMS_SERVICE.MODIFY_SERVICE('FRED2',params);
   END;
/

Then we start it

exec dbms_service.start_service('FRED2');

And guess what it works!

So it seems the original DBMS_SERVICE.CREATE_SERVICE call has a bug in 12c unless you use the new overloaded version - so use the new one!

If you do use the new one then the startup trigger can just stay as it is. However we should really be using Oracle Restart to do this and not triggers, as I was told by my Oracle contact - and before you say "but it's deprecated" - take a look at this note on MOS Doc ID 1584742.1
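For completeness, the Oracle Restart way of defining the same TAF service would be something along these lines (a sketch - MYDB is a placeholder for the db_unique_name and the options simply mirror the example above):

srvctl add service -db MYDB -service FRED2 -role PRIMARY -failovertype SELECT -failovermethod BASIC -failoverretry 180 -failoverdelay 1
srvctl start service -db MYDB -service FRED2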




Is it possible to cause tables to be stale with only tiny amounts of change?



I don't really like posting blog entries about Oracle statistics as so many other people do it better than i ever could (Jonathan, Randolf etc) - well that and i'm likely to get it wrong...

However i've been looking in to something today and i think it's worth sharing the results as it's quite interesting.

We have a large datawarehouse that's having stats issues (par for the course pretty much). I won't go into why the stats gathering policy is set up the way it is at the moment, but suffice to say the stats jobs are manually called as part of the ETL processing and they run with the 'GATHER AUTO' and 'GATHER STALE' options. This in itself isn't generally an issue in most cases, but we have some tables where a small % data change is significant in terms of data distribution (i.e. we are essentially adding new 'out of range' data to some of the tables) - this gives the optimizer a hard time and it resorts to some guesses - and of course that usually means it's wrong at least some of the time.

Changing the process is possible of course but the way it's built into the core of the system makes this a non trivial task - so what else can we do here without having to touch the code?

In the case of at least one table we pretty much want it to gather stats every day as part of the ETL process - the current hardcoded job will only do this if the table is considered 'stale' so what can we do about that?

The default 'staleness' threshold is 10%, so I wondered whether we could make this much smaller (one of my colleagues suggested 1%, and indeed some posts I read seemed to state this as a minimum) - I wanted to go smaller still, so I just tried the plsql to reduce it to less than 1%

begin
dbms_stats.set_table_prefs(USER, 'DEMO','STALE_PERCENT', '0.0001');
end;
/

And it quite happily accepted it - but that doesn't mean it will work of course - so i went about a test.

Let's first create a demo user, log on as that and create a table with a million rows (to make the maths easy) - I tried using the connect by row generator trick but it didn't seem to like large values so I reverted to plsql

SQL> create user demo identified by demo;

User created.

SQL> grant dba to demo;

Grant succeeded.


SQL> conn demo/demo
Connected.
SQL> create table demo (col1 number);

Table created.

SQL>


insert into demo 
SELECT LEVEL 
FROM dual
CONNECT BY LEVEL <= 10000001

SQL> insert into demo
SELECT LEVEL
FROM dual
CONNECT BY LEVEL <= 10000001  2    3    4
  5  /
insert into demo
            *
ERROR at line 1:
ORA-30009: Not enough memory for CONNECT BY operation

So revert to plsql here to generate the rows

Elapsed: 00:00:07.57
SQL>


  1  declare
  2  i number;
  3  begin
  4  for i in 1..1000000
  5  loop
  6  insert into demo values(i);
  7  end loop;
  8* end;
SQL> /

PL/SQL procedure successfully completed.

Elapsed: 00:00:28.68


SQL> select count(*) from demo;

  COUNT(*)
----------
   1000000

Elapsed: 00:00:00.02
SQL> commit;

So now we have a single column table called demo with a million rows in it.

Let's gather stats on that and check the figures

SQL> exec dbms_stats.gather_table_stats(USER,'DEMO');

PL/SQL procedure successfully completed.

SQL> select table_name,num_rows,blocks from user_tables;

TABLE_NAME                       NUM_ROWS     BLOCKS
------------------------------ ---------- ----------
DEMO                              1000000       7048

All as we would expect - lets now show the current staleness state

SQL> select TABLE_NAME,STALE_STATS from user_tab_statistics;

TABLE_NAME                     STALE_STATS
------------------------------ ------------
DEMO                           NO

And everything is 'fresh'
Now lets update 10% of the rows

SQL> update demo set col1=col1 where col1 < 100001
  2  /

100000 rows updated.

SQL> commit;

Commit complete.

Is it now 'stale'?

SQL> select TABLE_NAME,STALE_STATS from user_tab_statistics;

TABLE_NAME                     STALE_STATS
------------------------------ ------------
DEMO                           NO
 
Hmm no - and the reason is that monitoring info is held in some buffer and only flushed to disk every so often (not sure of the schedule)

We can force it though - so lets do that

SQL> exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;

PL/SQL procedure successfully completed.

SQL>  select TABLE_NAME,STALE_STATS from user_tab_statistics;

TABLE_NAME                     STALE_STATS
------------------------------ ------------
DEMO                           YES

OK - so that works nicely, but at this point i then started to think there was a fatal flaw in the plan - how am i going to make the flush job run when i want it to? Anyway i ignored that for now and carried on with the test.

Now I set the stale_percent to one ten-thousandth of a percent (1 row in this case)

begin
dbms_stats.set_table_prefs(USER, 'DEMO','STALE_PERCENT', '0.0001');
end;
/
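As a quick sanity check the preference can be read back with the documented get_prefs function - just to confirm the value stuck:

select dbms_stats.get_prefs('STALE_PERCENT', user, 'DEMO') from dual;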

So now let's try a test (I update a few rows rather than just one, to account for any rounding that might be going on)

SQL> update demo set col1=col1 where col1 <5;

4 rows updated.

I now flush the stats and check

SQL> exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;

PL/SQL procedure successfully completed.

SQL>  select TABLE_NAME,STALE_STATS from user_tab_statistics;

TABLE_NAME                     STALE_STATS
------------------------------ ------------
DEMO                           YES

Brilliant - so the new setting is working - changing just 4 rows made the whole million row table 'stale'.

Great apart from one thing - i had to manually flush the monitoring info - how was i going to deal with that?

Well some more reading revealed that supposedly dbms_stats calls this automatically when run with 'GATHER AUTO' or 'GATHER STALE' - which is just what I want - so let's try that.

I quickly change the date format so I can easily see the timestamp of when stats were gathered

SQL> alter session set nls_date_format='dd-mon-yyyy hh24:mi:ss';

Session altered.

I gather the stats now to remove any staleness

SQL>  exec dbms_stats.gather_table_stats(USER,'DEMO');

PL/SQL procedure successfully completed.

Then i change a few rows

SQL> update demo set col1=col1 where col1 <5;

4 rows updated.

SQL> commit;

Commit complete.

Confirm the current timestamp of when the stats were gathered

SQL> select table_name,last_analyzed from user_tables;

TABLE_NAME                     LAST_ANALYZED
------------------------------ -----------------------------
DEMO                           25-jun-2015 22:31:27

Now we run the stats job that our ETL batch would run in this case - calling gather_schema_stats with a filter for the single table and using the 'GATHER AUTO' option.

SQL> DECLARE
  2        filter_lst  DBMS_STATS.OBJECTTAB := DBMS_STATS.OBJECTTAB();
  3      BEGIN
  4        filter_lst.extend(1);
  5        filter_lst(1).ownname := 'DEMO';
  6        filter_lst(1).objname := 'DEMO';
  7        DBMS_STATS.GATHER_SCHEMA_STATS(NULL, obj_filter_list => filter_lst, options => 'GATHER AUTO');
  8      END;
  9  /

PL/SQL procedure successfully completed.

That ran OK - so lets check...... (high tension at this point.....)

SQL> select table_name,last_analyzed from user_tables;

TABLE_NAME                     LAST_ANALYZED
------------------------------ -----------------------------
DEMO                           25-jun-2015 22:32:10

And it's worked! So we have a possible short term solution - but longer term we need to be more explicit in the ETL about exactly what to gather and not rely on % of change as a reason for gathering new stats.


Apologies for the picture by the way it was the closest to 'stale bread' i could find.......



When FRA maths goes wrong



Stop press - I'm now on twitter thanks to peer group pressure (Tim and Chris you know who you are) so I'll hopefully be tweeting all my blog entries from now on (as long as i remember to do it...)

https://twitter.com/dbaharrison

Anyway - back to the blog

It's been a while since my last post (just been too busy with work to write anything up) but the issue we've had over the past couple of days i think warrants making the extra effort.

The post concerns the FRA (fast/flash recovery area - depending on your preference) on one of our 12.1.0.2 databases - specifically the cloud control repository - though the application is really irrelevant in this case.

For many years now we've used the FRA to store all of our archivelogs (regardless of whether flashback is actually enabled or not) - it's just cleaner that way and less maintenance overhead. We had a few teething issues initially (mainly understanding exactly how the mechanics of it worked) but for a long time now we've had no issues - that was until yesterday.

Where we saw this

Errors in file /oracle/admin/DB/diag/rdbms/DB/DB/trace/DB_arc1_29062.trc:
ORA-19815: WARNING: db_recovery_file_dest_size of 19327352000 bytes is 46.97% used, and has 10248593920 remaining bytes available.
************************************************************************
You have following choices to free up space from recovery area:
1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
   then consider changing RMAN ARCHIVELOG DELETION POLICY.
2. Back up files to tertiary device such as tape using RMAN
   BACKUP RECOVERY AREA command.
3. Add disk space and increase db_recovery_file_dest_size parameter to
   reflect the new space.
4. Delete unnecessary files using RMAN DELETE command. If an operating
   system command was used to delete files, then use RMAN CROSSCHECK and
   DELETE EXPIRED commands.
************************************************************************
Errors in file /oracle/admin/DB/diag/rdbms/DB/DB/trace/DB_arc1_29062.trc:
ORA-19809: limit exceeded for recovery files
ORA-19804: cannot reclaim 393922560 bytes disk space from 19327352000 limit

Initially we thought this was just some mismatch issue - the FRA is clearly full as the system cannot create an archive log, yet the error reports it as only 46.97% used.

This can generally happen in 2 ways:

1. The FRA is sized bigger than the actual physical disk - so oracle thinks more space is available than there actually is
2. Files that are non database files (or db files that the db does not know about) are taking up space in the FRA - so oracle thinks there is more space available than there actually is

In our case neither of these was true - in fact v$flash_recovery_area_usage clearly showed that the area was indeed full. This did not match the error in the alert log, where the arc process thought the area was only 47% full. Because it believed it was only 47% full it hadn't triggered the automatic deletion of old backed-up files - and even the out of space condition hadn't triggered it.
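For anyone wanting to make the same comparison, these are the two standard views to look at (nothing exotic here):

select space_limit, space_used, space_reclaimable, number_of_files
  from v$recovery_file_dest;

select file_type, percent_space_used, percent_space_reclaimable
  from v$flash_recovery_area_usage;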

So what was going on?

Well.... after a lot of false leads - including removing the dataguard attached to this db and sizing the FRA up and down in a vain attempt to make Oracle 'wake up' and realize what was going on - we finally decided this looks like a bug...

So we crawled through MOS (and google) and while there is often reference to this error - all the causes we had were ruled out or didn't apply in 12.1.0.2.

So what next? Well, as I know a lot of this information is held in the controlfile, and that is likely what arc is checking against internally, we decided to shut down the db and recreate the controlfiles.

so we ran that:

Completed: CREATE CONTROLFILE REUSE DATABASE "DB" NORESETLOGS FORCE LOGGING ARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 2
    MAXDATAFILES 30
    MAXINSTANCES 1
    MAXLOGHISTORY 1168
LOGFILE
  GROUP 1 (
    '/oracle/DB/oradata/DB/onlinelog/o1_mf_1_81q5cfpz_.log',
    '/oracle/DB/recovery_area/DB/onlinelog/o1_mf_1_81q5cltn_.log'
  ) SIZE 400M BLOCKSIZE 512,
  GROUP 2 (
    '/oracle/DB/oradata/DB/onlinelog/o1_mf_2_81q5bdfb_.log',
    '/oracle/DB/recovery_area/DB/onlinelog/o1_mf_2_81q5bkgr_.log'
  ) SIZE 400M BLOCKSIZE 512,
  GROUP 3 (
    '/oracle/DB/oradata/DB/onlinelog/o1_mf_3_81q59rnp_.log',
    '/oracle/DB/recovery_area/DB/onlinelog/o1_mf_3_81q59xwl_.log'
  ) SIZE 400M BLOCKSIZE 512
-- STANDBY LOGFILE
--   GROUP 4 '/oracle/DB/oradata/DB/onlinelog/standby_redo01.log' SIZE 400M BLOCKSIZE 512,
--   GROUP 5 '/oracle/DB/oradata/DB/onlinelog/standby_redo02.log' SIZE 400M BLOCKSIZE 512,
--   GROUP 6 '/oracle/DB/oradata/DB/onlinelog/standby_redo03.log' SIZE 400M BLOCKSIZE 512,
--   GROUP 7 '/oracle/DB/oradata/DB/onlinelog/standby_redo04.log' SIZE 400M BLOCKSIZE 512
DATAFILE
  '/oracle/DB/oradata/DB/datafile/o1_mf_system_76gbqhq1_.dbf',
  '/oracle/DB/oradata/DB/datafile/o1_mf_sysaux_76gbqkxk_.dbf',
  '/oracle/DB/oradata/DB/datafile/o1_mf_sys_undo_76gbqnxl_.dbf',
  '/oracle/DB/oradata/DB/datafile/mgmt_ecm_depot1.dbf',
  '/oracle/DB/oradata/DB/datafile/mgmt.dbf',
  '/oracle/DB/oradata/DB/datafile/mgmt_ad4j.dbf',
  '/oracle/DB/oradata/DB/datafile/o1_mf_eetpptdb_76gv4vnn_.dbf',
  '/oracle/DB/oradata/DB/datafile/o1_mf_xdb_76xz4r6w_.dbf',
  '/oracle/DB/oradata/DB/datafile/o1_mf_apex_76y18r5m_.dbf',
  '/oracle/DB/oradat
2015-07-16 10:46:59.202000 +01:00
ALTER DATABASE RECOVER  DATABASE
Media Recovery Start
 Started logmerger process
Parallel Media Recovery started with 8 slaves
Recovery of Online Redo Log: Thread 1 Group 1 Seq 30198 Reading mem 0
  Mem# 0: /oracle/DB/oradata/DB/onlinelog/o1_mf_1_81q5cfpz_.log
  Mem# 1: /oracle/DB/recovery_area/DB/onlinelog/o1_mf_1_81q5cltn_.log
2015-07-16 10:47:06.158000 +01:00
Media Recovery Complete (DB)
Completed: ALTER DATABASE RECOVER  DATABASE

*** Note here - need to see what is going on with redo logs here - there seems to be the syntax "BLOCKSIZE 512" added - is that a new 12c thing? ****

So we brought everything back up - now the FRA views show no archive logs (the new controlfile essentially knows nothing about what is in the area)

so we run

catalog recovery area;

This rediscovers what's in the FRA and updates the db/controlfile.

So those figures look correct (but they always did) - but what about the background process that's deleting things?

We shrink the FRA down to trigger the process (i think this threshold is 85% full but we shrunk it to a size that would make it 90% full)

ALTER SYSTEM SET db_recovery_file_dest_size=16G SCOPE=BOTH;

and.....

it works! it realizes it's 90% full and triggers the deletion of old backed up logs

case solved!

No idea what caused this weird corruption - hopefully it's just a one off, but if you do hit it at least the steps above give you a way out.....





Wait event bingo



How's this for a random selection of wait events - some of these I'd never even heard of before



What seemed to have happened is that something had died and smon had decided to do parallel transaction recovery - this seemed to coincide with an out of space condition in the archive area which lasted 60 seconds at most. The database returned to normal operation but not everything was working normally.

Smon seemed very confused and seemed to think that the arch area was still full


But the wait event shown against the smon session seemed to disagree with the graphic display


Anyway after looking at the screen for a while smon suddenly seemed to wake up and smell the coffee and everything kicked into life - at the same point this appeared in the alert log

2015-07-17 15:43:36.998000 +01:00
SMON: Parallel transaction recovery tried

Anyway - looks like there is some kind of bug here and smon wasn't waking up and realizing the issue. Sorted itself in the end with no intervention - strange one though.

cloud control art?



I'm thinking of starting a gallery for cloud control images - here is the first painting as you walk through the door



I call this "tidal wave hits redwood shores"

Other submissions gratefully received

Partitioning to the rescue



One of our core 3rd party applications has some functionality where results of certain calculations are extracted once per day into customer defined tables for use by downstream systems and interfaces. These results (and there are various different calculations outputting to various different tables) are retained based on business requirements for periods of either a week, a month or sometimes up to 6 months or a year. As the retention length is passed the oldest day is deleted as part of the extraction process.

(that sounded really dull when i read it back..... - we basically have a table that contains a rolling date range of data)

This all works fine, but over time and with increasing data volumes the performance degrades. An index on the extraction date helps initially (all interfaces always query based on this extract date by the way), but over time the constant cycle of inserting an entire day of data and deleting an entire day of data (to maintain the retention history) ends up pretty much destroying the index. This is one of the few cases where rebuilding indexes really is required.

This index rebuild helps but then again over time performance tails off.

There has to be a better way to deal with this - and indeed there is - lets attack this with partitioning.

But first up lets show the current system in action - here is the delete statement removing the out of range earliest date


And that same delete in SQL monitoring (i love this feature by the way)


Anyway you get the idea - the stats estimate is wrong as you can see from the sql monitoring screen - maybe a histogram would help in correcting the numbers but the plan would be the same - and you can see it's taking ages.

The story is a similar one for the insert - it's an incredibly simple plan (as you would expect) it just takes forever because of the index maintenance.

OK, so lets sort this - interval partitioning to the rescue.

In this case we want a partition per day as that fits in with the pattern of creation and deletion and we also want to subpartition by list on an additional column as the batch also always chooses from a set series of values and downstream systems query on this - an ideal candidate for partitioning

Oh - and we also want to implement this with no outage as the table is in use all the time, just to make it a little bit more tricky.....

So we have to use DBMS_REDEFINITION for this - but actually in this very simple case that's not difficult at all.

The first thing to do is work out the syntax for the CTAS statement to create an empty composite interval/list partitioned table - this is easier said than done - in fact the most useful thing about this post may be the syntax for doing this :-)

So here it is

create table TARGETSCHEMA.uepf_redef
PARTITION BY RANGE (business_date)
INTERVAL(NUMTODSINTERVAL (1, 'DAY'))
SUBPARTITION BY LIST(use_case)
SUBPARTITION TEMPLATE(
                Subpartition Sp1 Values('FP_BS_SUP_L_PE_0'),
                Subpartition Sp2 Values('Pre'),
                Subpartition Sp3 Values('Post'),
                Subpartition Sp4 Values('FP_BS_ALL_L_PW_0'),
                Subpartition Sp5 Values('FP_BS_SAL_P_PE_0'),
                Subpartition Sp6 Values('FP_BS_SUP_P_PE_0'),
                Subpartition Sp7 Values('FP_BS_CAP_P_PE_0'),
                Subpartition Sp8 Values('FP_BS_SAL_P_PE_1'),
                Subpartition Sp9 Values('Scd'),
                Subpartition Sp10 Values('FP_BS_SAL_L_PE_0'),
                Subpartition Sp11 Values('FP_BS_SAL_P_PE_2'),
                Subpartition Sp12 Values('FP_BS_SUP_P_PE_1'),
                Subpartition Sp17 Values(Default)
    )
                (
  PARTITION p1 VALUES LESS THAN (to_date('11/06/2015','dd/mm/yyyy'))
                )
                 as select * from TARGETSCHEMA.user_eet_position_feed where 1=0;

So at the top level the table is partitioned by business_date and we will end up with a separate partition per day; each of these partitions is then subpartitioned by list on 17 different values (including the default to capture anything not listed).

So that's 17 partitions being created every day.

Note here - the first partition date chosen is important - any values less than this all end up in the first partition - the interval routine does not create individual partitions prior to the first partition - this caught me out on the first attempt i had at this.

So now we have an empty table but with the correct definition - now we need to sync the data over and swap this table with the original one - some quick redef steps then

Do initial sync of data - we have no PK so rowid has to be used

EXEC DBMS_REDEFINITION.start_redef_table('TARGETSCHEMA', 'USER_EET_POSITION_FEED', 'UEPF_REDEF',options_flag=>DBMS_REDEFINITION.CONS_USE_ROWID);

Run any grants that exist on the original table against this new one - i.e. grant xxx on UEPF_REDEF to whoever;

Gather stats on this new table

EXEC DBMS_STATS.GATHER_table_STATS (OWNNAME => 'TARGETSCHEMA', TABNAME => 'UEPF_REDEF',GRANULARITY=>'ALL',estimate_percent=>null,degree=>8);

Now everything is ready (and in this simple case i have no FK/PK and no indexes at all - i'll come to that in a minute).

So we switch things over

EXEC DBMS_REDEFINITION.finish_redef_table('TARGETSCHEMA', 'USER_EET_POSITION_FEED', 'UEPF_REDEF');

Now the dictionary has been changed and the table names swapped - so now our active application table is composite partitioned with no outage.

So lets now test what the SQL plan looks like

For the same delete statement (well modified dates/use case - but same data volume and data distribution) we now get this


And in SQL monitoring we now see this


And we can see the whole thing took only a couple of minutes to delete - whereas the other one was still merrily going after 10 minutes or more. Note that the stats are also very good as we now have all of the data in question in one subpartition.

And back to the index point - it's simply not required - partition pruning is performing the same function in a much better way and the index just gets in the way. Due to the way the data is inserted/queried it's never going to be useful to have an index.

Beware simple tests though - a count(*) on this new table will perform better with an index (using the index as a skinny version of the table) - however in normal application access this statement would never run so the test is not valid and may mislead you.

That's a great result - with no application code changes we improved the system substantially. The insert is also improved as there is no longer any maintenance to be done on the huge index - so we help both phases of the batch job.

The next stage is to now replace the delete with a drop partition statement

We tested this already and the statements below work very well

alter table xx set interval ();
alter table xx drop partition P1; -- where P1 is the oldest
alter table xx set interval (NUMTODSINTERVAL (1, 'DAY'));

The first statement is needed because of a little quirk - you can't drop the first partition directly (at least not in 11.2), so you have to disable interval partitioning and then re-enable it after the drop.

This reduces the delete step to almost no time at all - however to implement this we need an application code change which will of course take longer.

A nice demo of some oracle features which solve an application issue quite elegantly i think you'll agree.


Database migration with datapump 10g->12c



Finally one of our 'legacy' systems is being brought up to date after many years of languishing on oracle 9. We're so far behind that a migration route is not straightforward (at least one that runs in a reasonable timeframe) so we are left with this convoluted route (there are reasons for this but i won't go into that for now - in fact i'm just going to cover one of these sections).

So general plan is:

1. create standby from 9i db to another server
2. Break standby link - leaving original db shutdown and untouched
3. Activate standby copy and upgrade to 10.2
4. Datapump over network link from 10.2 to precreated 12.1 shell db
5. Run app upgrade in 12.1

So there are a lot of steps there (and in fact a huge amount of complexity with interlinked systems) so the actual process is quite horrendous.

The bit i'm going to talk about for now though is step 4 - the datapump extract - there are maybe some useful elements that other people can take away from this:

So before the datapump process can begin i need to build an empty shell database - for completeness I've mentioned that here - we use a minimal config and omf to keep things really simple so don't be surprised if this section is quite short....

(I'm using 12.1.0.1.5 by the way - the max version supported by the 3rd party Vendor)

To start with i create an init file with this content (and add an entry into oratab for this db)

*.compatible='12.1.0.1'
*.db_create_file_dest='/oracle/RICHLINE/oradata/RICHLINE/oradata'
*.db_name='RICHLINE'
*.db_recovery_file_dest='/oracle/RICHLINE/oradata/RICHLINE/recovery_area'
*.db_recovery_file_dest_size=322122547200
*.diagnostic_dest='/oracle/admin/RICHLINE'
*.job_queue_processes=0
*.LOG_ARCHIVE_DEST_1='LOCATION=USE_DB_RECOVERY_FILE_DEST'
*.sga_max_size=6442450944
*.sga_target=6442450944


I then do the following to create the base db

SQL> startup nomount -- no db to actually do anything with yet - this just allocates mem and proc
SQL> create spfile from pfile; -- make sure we have spfile so control files etc get updated
SQL> shutdown; -- needed to read spfile 
SQL> startup nomount;
SQL> create database character set WE8ISO8859P1; --created

Database created.

That's it - surprisingly short (as I've mentioned in a previous post). Now we just have to run catalog/catproc which i won't paste here.....

So now we have a working database - albeit with nothing in

Now I create some user tablespaces (I base this on what exists in the source db - I'm not going to let datapump do the tablespaces - it's cleaner to do it manually rather than messing with conversion routines)

With OMF that's just

create tablespace xx;
create temporary tablespace temp;

which will create a file that autoextends up to 32GB, and if you need more than 32GB you just add another file via

alter tablespace xx add datafile;

So now that we have somewhere to put our data we need to crank up datapump - in my case I'm going to pull everything over a db link and not bother creating a file, copying it over and then importing - this means the import starts loading data as soon as you start the job.

To enable this i need a db link to my source db from this 12c db - so

create database link sourcedb connect to x identified by y using 'sourcetnsentry';

once i have that i can create my datapump parfile (and parfile is better here to avoid escape character nightmares)

so here it is:

NETWORK_LINK=SOURCEDB
full=y
PARALLEL=8
LOGFILE=impdp_.log
EXCLUDE=SCHEMA:"IN ('OUTLN','SYSTEM','APX','SYSMAN','SYS','ORA_AUD','CDCSUB','ANONYMOUS','CDCPUB','CDCUTILS','PERFSTAT','XDB','AUDSYS','HP_DBSPI','WMSYS','AUDSYS','OJVMSYS','TSMSYS')"
EXCLUDE=tablespace,password_verify_function,profile,password_history,statistics
streams_configuration=N
keep_master=y


A few comments here to explain some of that (ignoring the obvious ones)

1. I'm doing a full import - this means I get roles, public synonyms etc - much more than just schema exports - however I'm explicitly excluding some schemas as they spew out errors on the import side and I don't need any of them - they either already exist or are schemas I don't want to move
2. As mentioned I exclude tablespaces, but I also exclude password_verify_function and profile as there is a bug in my version and these have to be done manually - this may or may not be the case for you - see MOS 1909477.1 for that
3. Password history is excluded as it spews loads of errors and I don't really need it anyway
4. Stats are also excluded for a couple of reasons - and a point of note here as there is sometimes a misconception: datapump is not gathering stats, it's using dbms_stats.set_stats to explicitly set the values in use in the source system. For some reason this is very slow and in my case it's quicker to gather them again rather than set them - which is crazy as surely gather stats has to set them using similar routines? Anyway, collecting again in 12c is probably a good idea in any case as there are a lot of optimizer changes since 10g (see the sketch just after this list)
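Since statistics are excluded, a fresh gather on the 12c side after the import is needed - something along these lines (a sketch only; the schema name and degree here are assumptions, adjust for your own system):

exec dbms_stats.gather_schema_stats(ownname => 'APPSCHEMA', degree => 8);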

So anyway notes out of the way - i kick off the process and it merrily runs - until it hits this

ORA-39126: Worker unexpected fatal error in KUPW$WORKER.FETCH_XML_OBJECTS [TABLE:"APPSCHEMA"."AQ$_TRADE_DATA_T_G"]
ORA-01801: date format is too long for internal buffer

Again this is a bug caused by this being upgraded from a really old db - MOS note 1311659.1. This is fixed by a dictionary hack

UPDATE IND$ SET SPARE6 = SYSDATE WHERE TO_CHAR (SPARE6, 'DD-MON-YY') ='00-000-00';

Don't try that without oracle support........

So i retry

And get this message for a couple of tables but it runs to completion

ORA-31679: Table data object "appschema"."CAML" has long columns, and longs can not be loaded/unloaded using a network link

So longs don't work over db links - in my case that's no big issue - the tables in question are tiny and there is no FK/PK on them - these can be done separately out to file and copied over.
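The separate file-based run for those tables would look something like this (a sketch - the directory, dumpfile and logfile names are made up for illustration):

expdp / tables=appschema.CAML directory=tmp dumpfile=longs.dmp logfile=expdp_longs.log
impdp / directory=tmp dumpfile=longs.dmp logfile=impdp_longs.log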

Reviewing the remaining errors, one stands out and fixing it will resolve a number of issues

PLS-00201: identifier 'DBMS_JAVA.START_IMPORT' must be declared

Java was not installed in the DB as we didn't think the app used it - but it seems it does - so I now have to run initjvm to add that in - easy enough.

A re-run now is largely error free (after resetting the environment - using flashback in my case)

However it takes a while and there are a few simple things i can do to speed things up - here is the graph that covers most of the import progress


As you can see a few things stand out here that we can do - i call this tuning by colours.....

The brown is because the redo logs are tiny and i just have the OMF defaults - so i create some new groups and drop the small ones:

alter database add logfile size 1G; -- 3 times then drop the tiny default ones

alter database drop logfile group 1;
alter system switch logfile;
alter system checkpoint;
alter database drop logfile group 2;

The dark blue at the end is largely caused by the index build (some of which has to be done - i.e. reading the data - but some of it is sort i/o and i can reduce that by increasing the PGA).

SQL> alter system set pga_aggregate_target=4G;

After doing that the entire job can run in 55 minutes

For reference there is 60GB of application data and 7GB of indexes

From what I've seen previously this is actually pretty slow and i think there are a few reasons for that worth mentioning:

1. The source db is on a really old AIX 5.2 box so that's not helping
2. 10g doesn't help - there seems to be a bug where partition unload isn't done properly over a db link from 10g to 12c
3. There is 1 table accounting for about 85% of the data; it's very wide thanks to multiple char columns (each row is about 2K)

For reference the import generated 81GB of redo - more than i had imagined but roughly comparable with the data being loaded.

So what was the point of all that?

Well - it shows how simple the migration can be - you just have to put in a little ground work first to make it as quick as possible and iron out all the creases.

And for a further bit of info - a full export of the 12c database (to disk) takes 8 minutes, which is more in line with what I've seen with other testing on newer servers and versions. Datapump can be incredibly quick.

Do you pass the AUTHID test.....?



Now I'm not sure whether i already knew this and had just forgotten or whether i had genuinely missed this little 'feature' but it's useful to know this exists even if it's just so you don't look stupid when someone shows it to you.....

For you plsql people out there (I know some of you survived the onslaught of java and dotnet) you'll be familiar with the AUTHID clause within compiled plsql units - I can't remember exactly when it came in but it's been there for a while, and I thought I knew what it did. Well I did know what it did, but what I'd overlooked (or forgotten) is a subtle difference in the way roles are treated when dynamic SQL is executed from within a stored program unit.

Lets do a quick example to demo what i mean

We do some initial setup of 2 basic users and grant privileges on one of the schema's objects to a role.

SQL> create user bob identified by bob;
SQL> grant create session,create table, create procedure to bob;
SQL> create user dave identified by dave;
SQL> grant create session,create table, create procedure to dave;
SQL> create role bread;
SQL> grant unlimited tablespace to bob;
SQL> grant unlimited tablespace to dave;
SQL>  create table bob.demotab(col1 number);
SQL> grant select,insert on bob.demotab to bread;
SQL> grant bread to dave;
SQL> insert into bob.demotab values (1);
SQL> commit;

So after that, dave has permissions to select and insert on a table called demotab owned by bob via the bread role (see what i did there....)

Now as I'm sure we all know, if you create some stored plsql referencing that table it won't compile - as demonstrated here

SQL> create or replace procedure demoproc IS
  2  v_col number;
  3  begin
  4  select col1 into v_col from bob.demotab;
  5  end;
  6  /

Warning: Procedure created with compilation errors.

SQL> show errors
Errors for PROCEDURE DEMOPROC:

LINE/COL ERROR
-------- -----------------------------------------------------------------
4/1      PL/SQL: SQL Statement ignored
4/33     PL/SQL: ORA-00942: table or view does not exist

even though i can select from it normally in sqlplus

SQL> select * from bob.demotab;

      COL1
----------
         1

The reason is the role is disabled when the code is compiled so the rights to the table are lost.

If we change the code slightly to use AUTHID CURRENT_USER this happens

SQL> create or replace procedure demoproc authid current_user IS
v_col number;
begin
select col1 into v_col from bob.demotab;
end;
/
  2    3    4    5    6
Warning: Procedure created with compilation errors.

SQL> show errors
Errors for PROCEDURE DEMOPROC:

LINE/COL ERROR
-------- -----------------------------------------------------------------
4/1      PL/SQL: SQL Statement ignored
4/33     PL/SQL: ORA-00942: table or view does not exist

i.e. the exact same thing

What is interesting however is if we change the code to be dynamic SQL rather than static - so here is an example of that with the default definer's rights

SQL> create or replace procedure demoproc IS
v_sql varchar2(4000);
begin
v_sql := 'insert into bob.demotab values (99)';
execute immediate v_sql;
end;
/  2    3    4    5    6    7

Procedure created.

So the code compiles as the semantic checks aren't done at compile time now, but when we execute the code......

SQL> exec demoproc;
BEGIN demoproc; END;

*
ERROR at line 1:
ORA-00942: table or view does not exist
ORA-06512: at "DAVE.DEMOPROC", line 5
ORA-06512: at line 1

It fails as the role is still disabled at runtime

However - if we perform the same thing with AUTHID CURRENT_USER

SQL> create or replace procedure demoproc authid current_user IS
v_sql varchar2(4000);
begin
v_sql := 'insert into bob.demotab values (99)';
execute immediate v_sql;
end;
/  2    3    4    5    6    7

Procedure created.

SQL>  exec demoproc;

PL/SQL procedure successfully completed.

SQL> select * from bob.demotab;

      COL1
----------
         1
        99

And it works! So roles are enabled from dynamic sql within authid current_user blocks.
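A quick way to see this for yourself is to count SESSION_ROLES through dynamic SQL from an invoker's rights unit - a small sketch (the procedure name is made up, and dbms_output is assumed to be switched on):

create or replace procedure show_role_count authid current_user is
  v_cnt number;
begin
  -- under invoker's rights, roles stay enabled for dynamic SQL
  execute immediate 'select count(*) from session_roles' into v_cnt;
  dbms_output.put_line('enabled roles: ' || v_cnt);
end;
/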

Be honest did you know that....?

Startup quirks with deprecated parameters



I noticed a little quirk today in 12c when starting a database with deprecated parameters using an explicit pfile vs an implicit spfile/pfile (there is actually no way to explicitly point at an spfile from a startup command - only a pfile). I've no idea why there is a difference in the way this is handled, and I personally prefer how it behaves with the explicit pfile....

Let me demo what I'm talking about:

so if i start the db with spfile i get this

SQL> startup
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.
Total System Global Area 6413680640 bytes
Fixed Size                  3651344 bytes
Variable Size            3657435376 bytes
Database Buffers         2734686208 bytes
Redo Buffers               17907712 bytes
Database mounted.
Database opened.
SQL> 


If i rename the spfile and use the default pfile i get this

SQL> startup
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.
Total System Global Area 6413680640 bytes
Fixed Size                  3651344 bytes
Variable Size            3657435376 bytes
Database Buffers         2734686208 bytes
Redo Buffers               17907712 bytes
Database mounted.
Database opened.
SQL>


i.e. exactly the same

However if i explicitly reference the pfile i get this

SQL> startup pfile=./initRICHLINE.ora
ORA-32006: LOG_ARCHIVE_START initialization parameter has been deprecated
ORA-32006: SEC_CASE_SENSITIVE_LOGON initialization parameter has been deprecated
ORACLE instance started.
Total System Global Area 6413680640 bytes
Fixed Size                  3651344 bytes
Variable Size            3657435376 bytes
Database Buffers         2734686208 bytes
Redo Buffers               17907712 bytes
Database mounted.
Database opened.
SQL>


Why this would be different who knows.....

You can of course find the bad parameters in other ways - alert log, database views etc - but I quite like the immediacy of this feedback with the explicit pfile.
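For example, the deprecated ones can be listed straight from the instance with a standard view (just a quick check, nothing clever):

select name, value from v$parameter where isdeprecated = 'TRUE';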

The datapump detective...



As some of you may have read we're migrating an old 9i system to 12c and are having to make use of streams to make up for the fact that CDC has been removed in 12c. This had all been going well and we were quite happily replicating from 12.1.0.1 to 10.2.0.4 (where we built a downstream CDC - it's not the cleanest solution but it gets the job done and it should be short lived).

You'll notice I said 'had' been going well - we hit a bug whereby we had spilled data to disk from the capture queue and the database crashed at that exact point, which left us in a state where capture would not restart.

Now the bug was only fixed in 12.1.0.2, and the vendor of the 3rd party app wasn't keen on us moving to this version and would not offer 'full' support. We requested a one off backport (OOB) from Oracle support for 12.1.0.1, but it wasn't clear how long this would take so we had to go ahead and at least try the upgrade to 12.1.0.2.

Which we did and it indeed fixed the issue - but created a new one - again with streams. Now i had written a draft blog on all of this but the xray scanner at Munich airport seemed to decide to destroy my hard drive ( it was working before it went in and wasn't soon afterwards....) - anyway that story may never see the light of day now.

However we then had an even more interesting tale of what happened next....

Oracle support backported the fix to 12.1.0.1 and made it available for linux x64 - great - so lets downgrade.

Which i duly did and it seemed to work very smoothly - nice to know that this process works well.

What happened next was more than a little confusing...

So after the downgrade we had to drop and rebuild streams, as it was totally stuck because of the issue we hit at 12.1.0.2. We ran the same scripts as before, but something very odd was happening - we were defining 4 tables to be extracted in the PL/SQL block calling datapump, but only one was being extracted......

I won't paste the whole block here, but the key points of interest were these:

  object_name(1) := 'ZCOTRDDAY';
  object_name(2) := 'TPOW';
  object_name(3) := 'BOOK';
  object_name(4) := 'STREAM_HEARTBEAT';

  h1 := dbms_datapump.open(operation=>'EXPORT',job_mode=>'TABLE',
    remote_link=>'',
    job_name=>NULL, version=> min_compat);

The code was basically looping through the small array of values and building up a filter string to pass to dbms_datapump, which was opened using the call above - roughly along the lines of the sketch below.
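
I won't reproduce the real block, so treat this as a minimal sketch only - the dump file name, directory, schema filter and the hard-coded '10.2' version here are my own stand-ins for illustration (in the real block the version was fetched from the 10.2 database), not the production values:

DECLARE
  TYPE name_tab IS TABLE OF VARCHAR2(30) INDEX BY PLS_INTEGER;
  object_name name_tab;
  name_list   VARCHAR2(4000);
  h1          NUMBER;
  job_state   VARCHAR2(30);
BEGIN
  object_name(1) := 'ZCOTRDDAY';
  object_name(2) := 'TPOW';
  object_name(3) := 'BOOK';
  object_name(4) := 'STREAM_HEARTBEAT';

  -- build a quoted IN-list from the array for the NAME_EXPR filter
  FOR i IN 1 .. object_name.COUNT LOOP
    name_list := name_list || CASE WHEN i > 1 THEN ',' END || '''' || object_name(i) || '''';
  END LOOP;

  h1 := dbms_datapump.open(operation => 'EXPORT', job_mode => 'TABLE',
                           remote_link => NULL, job_name => NULL, version => '10.2');

  dbms_datapump.add_file(handle => h1, filename => 'streams_tabs.dmp',
                         directory => 'DATA_PUMP_DIR');

  -- restrict the job to the schema and the four tables built up above
  dbms_datapump.metadata_filter(h1, 'SCHEMA_EXPR', 'IN (''ZAINET'')');
  dbms_datapump.metadata_filter(h1, 'NAME_EXPR', 'IN (' || name_list || ')', 'TABLE');

  dbms_datapump.start_job(h1);
  dbms_datapump.wait_for_job(h1, job_state);
END;
/

The point is simply that the table list ends up inside a NAME_EXPR metadata filter on a job opened with an explicit version.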

What was actually happening is that STREAM_HEARTBEAT was exported but nothing else was.

First I thought it must be some bad PL/SQL and went down that route, but after debugging and checking it the PL/SQL looked fine - so it must be datapump?

So I then assumed that maybe the downgrade hadn't done something properly (though there had been no errors), so I'd rerun catalog and catproc - which I did.
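
For reference, rerunning them is just a case of running the dictionary scripts as SYSDBA in SQL*Plus (utlrp.sql afterwards to recompile invalid objects is a sensible extra step):

@?/rdbms/admin/catalog.sql
@?/rdbms/admin/catproc.sql
@?/rdbms/admin/utlrp.sql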

And nothing changed

Then I thought I'd try command-line datapump and see what happened:

expdp / tables=zainet.tpow, zainet.book  content=metadata_only reuse_dumpfiles=y

This worked fine - so the problem must be dbms_datapump, right? Well, as they both do essentially the same thing under the covers this seemed unlikely. Then I realised that the 'call' to datapump was actually different - in the PL/SQL version we were passing in the version parameter explicitly (and this was being fetched from the 10.2 database) - so I then tried that from the command line:

expdp / tables=zainet.tpow, zainet.book version=10.2 content=metadata_only reuse_dumpfiles=y

And guess what......

No errors, but the tables did not export....

Must be a downgrade bug, right?

So I ran catalog and catproc again.....

And nothing changed...

OK, time to get serious - let's trace this thing and turn it up to 11 (anything less than that for this kind of issue probably wouldn't be worth it):

expdp / tables=zainet.tpow, zainet.book version=10.2 content=metadata_only reuse_dumpfiles=y trace=1FF0300

Heading to the trace directory and finding the log for the datapump worker process, a quick scan of the file reveals this:

META:11:29:36.325:  get_xml_inputs TABLE_EXPORT/TABLE/TABLE_OBJNUM:
SELECT /*+all_rows*/ KU$.OBJ_NUM, KU$.TABLE_TYPE FROM SYS.KU$_10_1_TABLE_OBJNUM_VIEW KU$ WHERE NOT (BITAND (KU$.SCHEMA_OBJ.FLAGS,16)=16) AND   KU$.BASE_OBJ.OWNER_NAME IN (SELECT UNIQUE object_schema FROM "OPS$ORACLE"."SYS_EXPORT_TABLE_01" WHERE process_order = -55 AND duplicate BETWEEN 1 AND 2) AND  BITAND(KU$.SCHEMA_OBJ.FLAGS,4194304)=0 AND NOT EXISTS (SELECT 1 FROM  SYS.KU$NOEXP_TAB A WHERE A.OBJ_TYPE='TABLE' AND A.NAME=KU$.BASE_OBJ.NAME AND A.SCHEMA=KU$.BASE_OBJ.OWNER_NAME) AND NOT EXISTS (SELECT 1 FROM  SYS.KU$NOEXP_TAB A WHERE A.OBJ_TYPE='SCHEMA' AND A.NAME=KU$.BASE_OBJ.OWNER_NAME)  AND  (((ku$.schema_obj.owner_name,ku$.schema_obj.name) IN (SELECT object_schema, object_name FROM "OPS$ORACLE"."SYS_EXPORT_TABLE_01" WHERE process_order = -55 AND duplicate BETWEEN 1 AND 2)))
rowtag: ROW objnum_count: 0 callout: 8 Bind count: 0 Bind values:
META:11:29:36.325:  Begin statement open
META:11:29:36.391:  End statement open
META:11:29:36.391: metatrace-OPEN    41376.325    41376.391         .066 0 TABLE_EXPORT/TABLE/TABLE_OBJNUM
META:11:29:36.739: metatrace-FETCH    41376.391    41376.739         .348 1 TABLE_EXPORT/TABLE/TABLE_OBJNUM
META:11:29:36.739: SET_OBJECTS_FETCHED(2) called for TABLE_OBJNUM objects_fetched = 1
META:11:29:36.739: OBJECTS_FETCHED for TABLE_OBJNUM = 1

Now this is intriguing - there seems to be a version-specific view that determines what datapump can extract - in this case SYS.KU$_10_1_TABLE_OBJNUM_VIEW.

Let's look at the source of that for any clues...


  select t.* from ku$_11_2_table_objnum_view t
      where
         NOT EXISTS (
           select property from col$ c /* exclude tabs with virtual cols */
           where c.obj# = t.obj_num
               and bitand(c.property, 65536) >= 65536            /* virtual cols */
               and bitand(c.property, 256) = 0               /* not a sysgen col */
              and bitand(c.property, 32768) = 0)                  /* not unused */

Now this looks like a major clue - we basically extract anything in the 11.2 view apart from tables that match the criteria in the NOT EXISTS clause, i.e. tables with certain virtual columns.....

A quick query reveals


select c.name from col$ c, obj$ t /* exclude tabs with virtual cols */
           where c.obj# = t.obj#
               and bitand(c.property, 65536) >= 65536            /* virtual cols */
and t.name='TPOW'
/

NAME
--------------------------------------------------------------------------------
SYS_STSM_MC8HFVOZZTCAY2QTKG95P
SYS_NC00215$
SYS_NC00216$
SYS_STS4W6SM11$N7M7BC13DVHUTUU
SYS_STSC#WMWQ2GLEP04WAS49R4ZM3


Right...... what are those doing there? There seem to be two distinct types of virtual column created - but I haven't added any virtual columns.
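
(As an aside, you don't have to dig into col$ to see these - the documented dictionary view shows the same thing. A quick sketch for this schema and table:)

select column_name, hidden_column, virtual_column
from   dba_tab_cols
where  owner = 'ZAINET'
and    table_name = 'TPOW'
and    hidden_column = 'YES'
and    virtual_column = 'YES';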

However I know from reading various things that extra stuff goes on behind the scenes in 12c that might be causing this, so I immediately check for function-based indexes:

  1  select index_name,FUNCIDX_STATUS from dba_indexes where table_name='TPOW' and owner='ZAINET'
  2* and funcidx_status='ENABLED'
SQL> /

INDEX_NAME
--------------------------------------------------------------------------------
FUNCIDX_
--------
TPOW_TFIRM
ENABLED

TPOW_TRADTNUM
ENABLED


SQL> drop index zainet.TPOW_TFIRM;

Index dropped.

SQL> drop index zainet.TPOW_TRADTNUM;

Index dropped.

Rerunning the query then reveals this:

select c.name from col$ c, obj$ t /* exclude tabs with virtual cols */
           where c.obj# = t.obj#
               and bitand(c.property, 65536) >= 65536            /* virtual cols */
and t.name='TPOW'
/

NAME
--------------------------------------------------------------------------------
SYS_STSM_MC8HFVOZZTCAY2QTKG95P
SYS_STSQDSVGIY7_3QMVFD3Z#WODV2
SYS_STSH#5$FM9XDKSA1Z1GY468$F_
SYS_STS4W6SM11$N7M7BC13DVHUTUU
SYS_STSC#WMWQ2GLEP04WAS49R4ZM3

So - that's kind of progress - we've dropped the FBIs (only temporarily, to allow us to move past this issue) - but what are the other virtual columns (and actually why have I got two extra ones now)?

Well - it turns out they are extended statistics - so let's get rid of those with a quick bit of SQL*Plus trickery:

select 'exec dbms_stats.drop_extended_stats('||''''||owner||''''||','||''''||table_name||''''||','||''''||EXTENSION||''''||');'
from dba_stat_extensions
where owner='ZAINET' and table_name='TPOW'

This returns a list of SQL statements I can run to drop all the extended stats - which I do, and it all works apart from this one:


SQL> exec dbms_stats.drop_extended_stats('ZAINET','TPOW','("AUDIT_ALF","AUDIT_AOHM","TRADE_TNUM","TRADE_TDATE")');
*
ERROR at line 1:
ORA-00001: unique constraint (SYS.I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST) violated
ORA-06512: at "SYS.DBMS_STATS", line 13020
ORA-06512: at "SYS.DBMS_STATS", line 13074
ORA-06512: at "SYS.DBMS_STATS", line 45105
ORA-06512: at line 1

The only real hit I could find for this was an unanswered forum question - but I knew a way to get past it: just delete all of the optimizer stats history.....


SQL> exec DBMS_STATS.PURGE_STATS(DBMS_STATS.PURGE_ALL);

PL/SQL procedure successfully completed.

Now I try again and all is well.......

SQL> exec dbms_stats.drop_extended_stats('ZAINET','TPOW','("AUDIT_ALF","AUDIT_AOHM","TRADE_TNUM","TRADE_TDATE")');

PL/SQL procedure successfully completed.

And now when we do a datapump extract with version=10.2 it all works! So the problem was that 12c added some virtual columns which then precluded that table from being extracted in 10.2 format - I would imagine the filter was really intended to deal with customer-created virtual columns rather than hidden internal ones, so perhaps the SYS.KU$_10_1_TABLE_OBJNUM_VIEW view should be slightly amended to deal with these special cases.

Anyway quite an interesting issue that was nicely explained by doing a bit of tracing and detective work.

Features that time forgot



Looking at Neil's blog earlier today I saw there is a new feature which is quite nice (only from 12.1.0.2):

https://chandlerdba.wordpress.com/2015/07/14/locking-privileges-in-oracle/

It's been missing for a while, though personally it's never really caused an issue - nice that it is solved though.

When I clicked through to look at the Oracle docs, though, I found this little snippet of code further down:

GRANT REFERENCES (employee_id), 
UPDATE (employee_id, salary, commission_pct)
ON hr.employees
TO oe;

Hmm, that's interesting, I thought - I didn't think you could do that (granting access on specific columns only) - but there was no mention of it being new, so I started looking back through the docs. I got back as far as 8i (it seems they haven't digitized the papyrus of v8 and earlier yet - at least not that I could find), and guess what: it's been there all this time and completely passed me by. It shows the importance of reading the docs when new releases come out, even if it's just to pick up stuff you should have known 20 years ago.......
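
And as a quick sanity check, column-level grants like the one above are visible afterwards in DBA_COL_PRIVS - a minimal sketch using the same HR example schema as the docs:

select grantee, table_name, column_name, privilege
from   dba_col_privs
where  owner = 'HR'
and    table_name = 'EMPLOYEES';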


Function tuning with dual



Yesterday I had an email from a developer asking me this:

"We're running an extract report and when we run it in plsql developer we get rows back immediately, however when we do a full extract of the entire dataset to a file it takes 2 hours - what's going on? "

(with the subtle hint of course being that it is a database issue and not something they've done wrong).

I first explained that getting a few rows back in an IDE doesn't mean all the work is done - Oracle has just returned the first few rows quickly but hasn't actually done everything you asked it for - so you can't use that as a performance benchmark.
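
A simple way to make the comparison fair is to force the full fetch without displaying the rows - in SQL*Plus, something like this (the view and scenario_id being the ones from the actual report):

set timing on
set autotrace traceonly statistics
select * from firm_group_view where scenario_id = 11295295;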

However, of course there is an issue - this extract (which incidentally returns 3 million rows, so it's not that small) is taking two hours, and that's a problem for the business; we need to get the run time down - so I need to dig a little deeper.

The actual query is just this

select * from firm_group_view where scenario_id = 11295295

which looks very simple at first glance though the name including the word 'view' probably gives you some hint that it's not just a simple matter to get these results.

So looking at the view code, it's not overly complex: an 8-table join UNION ALL'd with another 8-table join - a lot of the tables are small and only two big tables really contain the large data volume.

Looking at the actual plan it's using (in SQL Monitoring here, so I get some real-time info on what it's doing and where), the plan actually looks OK - it just seems to be slow doing it. An extract from that is below.


Initially I thought that changing the join order with the LEADING hint - doing the 'big' tables first and leaving the 'lookup' values until the end - might help, so I went ahead and did that. The plan changed as expected but the performance was much the same - so what was going on?

Well, another look at the code revealed a function call in the statement - something that the explain plan (and indeed SQL Monitoring) does not factor in; it doesn't obviously show up at all. The function can be doing a huge amount of work, and that only becomes apparent by tracing or by looking at what other SQL the session is executing. The query may appear to be slow on step 'X' in the plan, but in fact that is the stage where it's executing the function......
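
One cheap way to spot this is simply to look at what the session is really executing while the report runs - the recursive SQL issued by the function shows up as the session's current statement. A rough sketch (:report_sid here is just a stand-in for the SID of the session running the extract):

select s.sid, s.sql_id, q.sql_text
from   v$session s
       join v$sql q on q.sql_id = s.sql_id
                   and q.child_number = s.sql_child_number
where  s.sid = :report_sid;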

The function in this case was

select
   CALCULATE_PHASE_IN_FACTOR(tls.tp_level_id,rmsd.timeslot_local,ms.scenario_date) as Calc_Factor,

So I tried commenting out the function and replacing it with a static value just to see what happened:

select
   1 as Calc_Factor,

Now when we run the query it's massively faster, confirming that the function is indeed the problem.

Whatever that function is doing, it is running 3 million times - so something being even a tiny bit slow adds up to a lot of time.
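To put that in perspective: even at, say, 2 milliseconds per call, 3 million calls comes to roughly 6,000 seconds - around 100 minutes - so a per-call cost that looks negligible on its own can easily account for most of the two-hour runtime.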

So we're making progress - time to investigate the function

So I open it up and something immediately jumps out (and this same construct is used twice in the same function...):

    -- ADD_MONTHS((LAST_DAY(I_CUT_OFF_DATE) + 1), -1) is the first day of the cut-off month,
    -- so this generates one row per day from the 1st up to the cut-off date and counts the weekdays
    SELECT COUNT(*)
      INTO V_WORKING_DAYS
      FROM (SELECT rownum rnum
              FROM all_objects
             WHERE rownum <= I_CUT_OFF_DATE + 1 - ADD_MONTHS((LAST_DAY(I_CUT_OFF_DATE) + 1), -1))
     WHERE to_char(ADD_MONTHS((LAST_DAY(I_CUT_OFF_DATE) + 1), -1) + rnum - 1, 'DY')
           NOT IN ('SAT', 'SUN');

what?

This is a really strange construct to use in 'application' code, but I can see what they are doing - they just need something with a lot of rows to act as a 'dummy' table to work with.

In fact I googled this code, as I guessed the developers had borrowed it from somewhere, and found it in an AskTom question from 13 years ago......

What the developers are actually trying to do is work out the number of working days in a given month for use in later calculations - this is not as trivial as it first sounds, and there are quite a few solutions posted on the internet for it.
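
For example, one compact approach is to generate the days of the month from dual and count the weekdays - a sketch only (it counts the whole month containing the supplied :cut_off_date bind, which is my own stand-in for the function's parameter, and it ignores public holidays):

select count(*) as working_days
from  (select trunc(:cut_off_date, 'MM') + level - 1 as d
       from   dual
       connect by level <= last_day(:cut_off_date) - trunc(:cut_off_date, 'MM') + 1)
where to_char(d, 'DY', 'NLS_DATE_LANGUAGE=ENGLISH') not in ('SAT', 'SUN');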

In order to get a 'quick win' and not mess around too much with the code, I decided to replace the use of all_objects with a dummy row generator (I'm not sure who came up with this trick originally, but I shamelessly borrowed the code from Jonathan's blog to save me some time, having seen him use it at a recent Oracle Midlands event).

So the code now becomes

    WITH temparray AS (
           SELECT rownum id
             FROM dual
          CONNECT BY level <= 50
         )
    SELECT COUNT(*)
      INTO V_WORKING_DAYS
      FROM (SELECT rownum rnum
              FROM temparray
             WHERE rownum <= LAST_DAY(I_CUT_OFF_DATE) + 1 - ADD_MONTHS((LAST_DAY(I_CUT_OFF_DATE) + 1), -1))
     WHERE to_char(ADD_MONTHS((LAST_DAY(I_CUT_OFF_DATE) + 1), -1) + rnum - 1, 'DY')
           NOT IN ('SAT', 'SUN');

So instead of using all_objects (which has to do quite a bit of work against multiple dictionary tables) we now just make use of 'dual'.

This individual query, run on its own, is now more than a factor of 10 faster - and since the function runs it twice, we get that saving twice over on every call.

I also considered RESULT_CACHE at this point, but it seems the values being passed to the function are rarely repeated, so that produces no tangible benefit.
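
For completeness, enabling the cache is just a keyword on the function definition - a purely hypothetical sketch (the function name and placeholder body below are invented; the real calculation isn't shown):

create or replace function calculate_phase_in_factor_rc (
  p_tp_level_id   number,
  p_timeslot      date,
  p_scenario_date date
) return number
  result_cache
is
begin
  return 1;  -- placeholder body; repeated calls with identical arguments would be served from the cache
end;
/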

The extract process now completes in about 28 minutes

It can still be improved further by redesigning the logic of what the function does and avoiding the two calls to very similar processing, but larger changes like that require more effort and testing, and the gain is small relative to that effort. We've fixed a large part of the issue, and the payoff from going further is probably not worth it unless this is an absolutely time-critical report that can make a huge difference to a business process.

An interesting tuning case and a reminder that the explain plan is often not the whole story of what is going on.

