Good Singapore Maths Students Would Likely Make Good Oracle DBAs (Problems)

An interesting mathematics based question from a Singapore high school exam has been doing the internet rounds in the past few days. Considering it’s aimed at high school students, it’s a tricky one and obviously designed to filter out the better students, in a country with a very good reputation for churning out mathematically gifted […]

Ever Wondered How Pythian is Kind of Like a Fire Truck?

pierce____enforcerg____-54ee03db38d10

 

I have.

Coming from the world of selling fire trucks I’m used to selling necessary solutions to customers in need. The stakes are high. If the truck doesn’t perform best case scenario it’s a false alarm. Worst case scenario someone, many people, die.

Let me tell you a bit about fire trucks.

A lot of people think that a fire truck is a fire truck. That there is some factory where fire trucks are made, carbon copies of one another, varying only in what they carry – water, a pump, a ladder. That’s not the case. Every truck is custom engineered, designed, and manufactured from scratch. Things can go wrong. In a world where response time is everything, you don’t want something to go wrong. Not with the fire truck. Not when everything else is going wrong. Not when someone is trapped in their vehicle. Not when a house is burning down.

For the past five years I have been selling disaster management systems. There has been a clear, immediate, pressing need from my customers. I loved the urgency, I fed off that energy, helping people in charge of saving lives come up with solutions that help them do just that. When first walking into Pythian, I didn’t understand the importance of data, I didn’t comprehend the stakes. But they are present and the analogy can be made.

Pythian’s services are like a fire truck.

Data is like your house, your car, your life. When your business is dependent on your data and your data fails, your business fails. Data failures are serious. Downtime causes huge revenue losses as well as loss of trust and reputation. Identity theft, loss of security, these disasters are pressing threats in our digitized society.

Pythian’s FIT-ACER program is like your Fire Marshall.

We don’t just prepare for disasters, we help prevent them. Modeled after the Mayo Clinic’s patient checklist, Pythian’s FIT-ACER human reliability check acknowledges that no matter how intelligent our DBAs are (http://www.pythian.com/experts/) they can still make mistakes:

FIT-ACER: Pythian Human Reliability Checklist

F

Focus (SLOW DOWN! Are you ready?)

A

Assess the command (SPEND TIME HERE!)

I

Identify server/DB name, time, authorization

C

Check the server / database name again

T

Type the command (do not hit enter yet)

E

Execute the command

R

Review and document the results

We don’t just hire the best to do the best work, we hire the best, make sure they’re at their best, check their best, and apply their best. Every time we interact with your data we do so at a high level to improve your system, to prevent disaster.  And we answer our phones when disaster hits.

The average response time for a fire crew in Ontario is 6 minutes. The average response time for Pythian is under 4.

Take it from someone who knows disaster,

Pythian’s the best fire truck around.

KeePass 2.29

KeePass 2.29 has been released. You can find the downloads and changelog in the usual places.

Read about my Adventures with DropBox and KeePass to see how I make use of it.

Cheers

Tim…


KeePass 2.29 was first posted on April 14, 2015 at 7:26 am.
©2012 "The ORACLE-BASE Blog". Use of this feed is for personal non-commercial use only. If you are not reading this article in your feed reader, then the site is guilty of copyright infringement.

Community dinner @ Pedro’s

Folks, as usual Pythian is organizing the community dinner. After many years, food, (responsible) drinking and photos, this event has become an important moment for all of us, to know each other better, discuss and have fun.

This year is also the 20th year for MySQL so … YEAAAH let us celebrate, with more food, fun and responsible drinking.

If you had not done it yet … register yourself here: https://www.eventbrite.com/e/the-pythian-mysql-community-pay-your-own-way-dinner-tickets-15692805604

Info about the event:

When: Tuesday April 14, 2015 – 7:00 PM at Pedro’s (You are welcome to show up later, too!)
Where: Pedro’s Restaurant and Cantina – 3935 Freedom Circle, Santa Clara, CA 95054

 

I know, I know … we are that kind of people that decide where to go at the last minute, and every year we do the same, but if you could register, that will help us to organize it better … and c’mon the dinner is on Tuesday … so we are almost there!!!

 

Anyhow, hope to see all of you there, all of you!

Some reference: Menu Eventbrite Pedro

Technology for the Non-Technical

I am potentially one of the least technical people in my generation. I’m 30 and I am afraid of my cellphone, my laptop, Netflix, the microwave…. Okay, afraid is maybe a strong word, but baffled by them at the very least.

In high school, while my classmates wrote most of their papers and assignments on the computer, I insisted on writing everything out by hand and only typed it out afterwards if absolutely required. It wasn’t that I had issues with typing – my mom who worked as an administrator for many years made sure that I learned to type from a very young age and I type quickly with a reasonable amount of accuracy. I just felt that writing by hand kept me more “connected” to the words I penned. Simply, my name is Sarah and I am a Luddite.

After high school I studied journalism for a couple of years and then entered the workforce into a number of different jobs, such as in sales and marketing and it became necessary for me to “engage” with technology a little more heavily. Typing articles and assignments slowly became second nature but grocery lists, thank you notes, birthday cards all continued to be written by hand.

For the last few years I’ve been working for technology and IT organizations, and for the last 14 months I’ve been working with Pythian, a leading IT services provider specializing in data infrastructure management. That was a big leap for me. Not only was I required to use technology constantly in my day-to-day (Smartphone, CRM system, soft phone, multiple email interfaces ACK!), but I also needed to do a lot more than dip my toes into some fairly intense technical knowledge to gain an understanding of our client base and what solutions would be most appropriate for the people I speak to every day. These people are Chief Information Officers, Chief Technology Officers’s and Vice Presidents of Information Technology for companies that are incredibly data-dependent. The quality and security of their data management directly affects their revenue and it’s critical that it is handled with a great amount of expertise and attention to detail. Kind of intimidating.

I have spent the last year wrapping myself in terms like NoSQL, non-relational database, Hadoop, MongoDB, SQL Server and Oracle. Do I have a perfect understanding of the benefits and draw-backs of each of these yet? No. What I do have is a great network of technical geniuses who work with me who have spent their careers becoming experts in their respective technologies. I know who the best resources are and how to connect with them to get the best answers and solutions. I’m very lucky to work at company that is incredibly transparent – questions are always welcomed and answered. I sit sandwiched between the offices of the Chief Revenue Officer and the CEO and Founder of our organization and while both are incredibly busy people, they are also happy to answer questions and share their insights and energy with anyone here.

All of our technical resources are just an instant message away and can often answer my questions in a few concise lines. So, while I am still monstrously uncomfortable with tasks like defragging (sounds like organized Fraggle removal to me) my computer or resetting my smartphone when it acts up, I am coming along slowly, in baby steps – an IT late-bloomer you could say – and it’s all much less painful than I ever feared it would be.

My thoughts on the Resilience of Cassandra

This blog is a part 1 of a 2 in a series. This will be different from my previous blogs, as this is more about some decisions you can make with Cassandra regarding the resilience of your system. I will talk deeply about this topic in the upcoming Datastax Days in London (https://cassandradaylondon2015.sched.org/), this is more of an introduction!

TL;DR: Cassandra is tough!

Cassandra presents itself as a “Cassandra delivers continuous availability, linear scalability, and operational simplicity across many commodity servers with no single point of failure, along with a powerful data model designed for maximum flexibility and fast response times.“ (http://docs.datastax.com/en/cassandra/2.0/cassandra/gettingStartedCassandraIntro.html). In a production system, having your persistence layer failure tolerant is a big thing. Even more so when you can make it resilient to full locations failure through geographic replication (and easily).

As in any production system you need to plan for failure. Should we blindly trust in Cassandra resilience and forget about the plan because “Cassandra can handle it”? By reading the documentation, some may think that by having several data centers and a high enough replication factor we are covered. In part this is true. Cassandra will handle servers down, even a full DC (or several!) down. But, anyway, you should always prepare for chaos! Failure will increase pressure on your remaining servers, latency will increase, etc. And when things get up again, will it just work? Getting all data in sync, are you ready for that? Did you forgot about gc_grace_seconds? There are lots of variables and small details that can be forgotten if you don’t plan ahead. And then in the middle of a problem, it will not help having those details forgotten!

My experience tells me that you must take Cassandra failures seriously, and plan for them! Having a B plan is never a bad thing, and a C even. Also, make sure those plans work! So for this short introduction I will leave a couple of recommendations:

  • Test your system against Cassandra delivering a bad service (timeouts, high latency, etc).
  • Set a “bare minimum” for your system to work (how low can we go on consistency, for example).
  • Test not only your system going down, but also prepare for the coming up!
  • Keep calm! Cassandra will help you!

Overall, Cassandra is a tough and robust system. I’ve had major problems with network, storage, Cassandra itself, etc. And in the end Cassandra not only survived, it gave me no downtime. But with every problem I had, it increased my knowledge and awareness of what I could expect. This lead to planning for major problems (which did happen) and this combined with the natural resilience of Cassandra made me go through those events without downtime.

Fell free to comment/discuss about it, in the comment section below! Juicy details will be left for London!

Not Exists

The following requirement appeared recently on OTN:


=========================================================================================================
I have a following query and want to get rid of the "NOT EXISTS' clause without changing the end results.

SELECT   A.c,
         A.d,
         A.e,
         A.f
  FROM   A
WHERE   NOT EXISTS (SELECT   1
                       FROM   B
                      WHERE   B.c = A.c AND B.d = A.d AND B.e = A.e);
===========================================================================================================

Inevitably this wasn’t the problem query, and almost inevitably the OP was asking us how to implement a solution which wasn’t appropriate for a problem that shouldn’t have existed. Despite this it’s worth spending a little time to take the request at its face value and look at the sort of thing that could be going on.

First, of course, you cannot get rid of the “not exists” clause, although you may be able to make it look different. If you want “all the rows in A that are not referenced in B” then you HAVE to examine all the rows in A, and you have to do some sort of check for each row to see whether or not it exists in B. The only option you’ve got for doing something about the “not exists” clause is to find a way of making it as a cheap as possible to implement.

A couple of people came up with suggestions for rewriting the query to make it more efficient. One suggested writing it as a “NOT IN” subquery, but it’s worth remembering that the optimizer may cheerfully transform a “NOT IN” subquery to a “NOT EXISTS” subquery if it’s legal and a manual rewrite may overlook the problem of NULLs; another suggested rewriting the query as an outer join, but again it’s worth remembering that the optimimzer may transform a “NOT EXISTS” subquery to an “ANTI-JOIN” – which is a bit like an outer join with filter, only more efficient. So, before suggesting a rewrite, it’s worth looking at the execution plan to see what the optimizer is doing just in case it’s doing something silly. There are two options – anti-join or filter subquery.

Here, with code I’ve run under 10.2.0.5 to match the OP, is a demonstration data set, with the two plans you might expect to see – first, some the data:


execute dbms_random.seed(0)

create table t1
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(0,4))           c,
        trunc(dbms_random.value(0,5))           d,
        trunc(dbms_random.value(0,300))         e,
        rownum                                  f,
        rpad('x',100)                   padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create table t2
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(0,4))           c,
        trunc(dbms_random.value(0,5))           d,
        trunc(dbms_random.value(0,300))         e,
        rownum                                  f,
        rpad('x',100)                   padding
from
        generator       v1,
        generator       v2
where
        rownum <= 24000
;

create index t1_i1 on t1(c,d,e);
create index t2_i1 on t2(c,d,e);

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T2',
                method_opt       => 'for all columns size 1'
        );
end;
/

The OP had followed up their original query with a claim that “Table A holds 100 million rows and table B holds 24,000″ – that’s a lot of checks (if true) and you ought to be asking how quickly the OP expects the query to run and how many of the 100 M rows are going to survive the check. I’ve set up just 1M rows with 6,000 distinct values for the column combination (c,d,e), and a reference table with 24,000 rows which are likely to include most, but not all, of those 6,000 combinations.

Rather than generate a very large output, I’ve written a query that generates the required data set, then counts it:


select
        max(f), count(*)
from (
        SELECT   /*+ no_merge */
                 A.c,
                 A.d,
                 A.e,
                 A.f
          FROM   t1 A
        WHERE   NOT EXISTS (SELECT   /* no_unnest */
                                      1
                               FROM   t2 B
                              WHERE   B.c = A.c AND B.d = A.d AND B.e = A.e)
)
;

This took about 0.35 seconds to run – aggregating roughly 14,500 rows from 1M. The plan was (as I had expected) based on a (right) hash anti join:


---------------------------------------------------------------------------------
| Id  | Operation               | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |       |     1 |    13 |  2183   (5)| 00:00:11 |
|   1 |  SORT AGGREGATE         |       |     1 |    13 |            |          |
|   2 |   VIEW                  |       |   999K|    12M|  2183   (5)| 00:00:11 |
|*  3 |    HASH JOIN RIGHT ANTI |       |   999K|    23M|  2183   (5)| 00:00:11 |
|   4 |     INDEX FAST FULL SCAN| T2_I1 | 24000 |   234K|    11  (10)| 00:00:01 |
|   5 |     TABLE ACCESS FULL   | T1    |  1000K|    14M|  2151   (4)| 00:00:11 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("B"."C"="A"."C" AND "B"."D"="A"."D" AND "B"."E"="A"."E")

Oracle has built an in-memory hash table from the 24,000 rows in t2, then scanned the t1 table, probing the hash table with each row in turn. That’s 1M probe in less than 0.35 seconds. You ought to infer from this that most of the time spent in the original query should have been spent scanning the 100M rows, and only a relatively small increment appear due to the “not exists” clause.

You’ll notice, though that there was a comment in my subquery with the /* no_unnest */ hint embedded – if I change this from a comment to a hint (/*+ */) I should get a plan with a filter subquery, and maybe that’s what’s happening to the OP for some odd reason. Here’s the plan:


------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |     1 |    13 | 15166   (1)| 00:01:16 |
|   1 |  SORT AGGREGATE      |       |     1 |    13 |            |          |
|   2 |   VIEW               |       |   999K|    12M| 15166   (1)| 00:01:16 |
|*  3 |    FILTER            |       |       |       |            |          |
|   4 |     TABLE ACCESS FULL| T1    |  1000K|    14M|  2155   (4)| 00:00:11 |
|*  5 |     INDEX RANGE SCAN | T2_I1 |     4 |    40 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter( NOT EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "T2" "B"
              WHERE "B"."E"=:B1 AND "B"."D"=:B2 AND "B"."C"=:B3))
   5 - access("B"."C"=:B1 AND "B"."D"=:B2 AND "B"."E"=:B3)

The query took 1.65 seconds to complete. (And re-running with rowsource execution statistics enabled, I found that the subquery had executed roughly 914,000 times in that 1.65 seconds). Even if the original query had used the filter subquery plan the subquery shouldn’t have made much difference to the overall performance. Of course if T2 didn’t have that index on (c,d,e) then the filter subquery plan would have been much more expensive – but then, we would really have expected to see the hash anti-join.

If you’re wondering why the subquery ran 914,000 times instead of 1M times, you’ve forgotten “scalar subquery caching”.  The session caches a limited number of results from subquery execution as a query runs and may be able to use cached results (or simply a special “previous-execution” result) to minimise the number of executions of the subquery.

Did you notice the index I created on t1(c,d,e) ? If I drive the query through this index I’ll access all the rows for a given combination of (c,d,e) one after the other and only have to run the subquery once for the set. To make this happen, though, I’ll have to declare one of the columns to be NOT NULL, or add a suitable “column is not null” predicate to the query; and then I’ll probably have to hint the query anyway:


select
        max(f)
from (
        SELECT   /*+ no_merge index(a) */
                 A.c,
                 A.d,
                 A.e,
                 A.f
          FROM   t1 A
        WHERE   NOT EXISTS (SELECT   /*+ no_unnest */
                                      1
                               FROM   t2 B
                              WHERE   B.c = A.c AND B.d = A.d AND B.e = A.e)
        and     c is not null
)
;

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |     1 |    13 | 65706   (1)| 00:05:29 |
|   1 |  SORT AGGREGATE               |       |     1 |    13 |            |          |
|   2 |   VIEW                        |       |   999K|    12M| 65706   (1)| 00:05:29 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T1    | 50000 |   732K| 52694   (1)| 00:04:24 |
|*  4 |     INDEX FULL SCAN           | T1_I1 | 50000 |       |  2869   (2)| 00:00:15 |
|*  5 |      INDEX RANGE SCAN         | T2_I1 |     4 |    40 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("C" IS NOT NULL AND  NOT EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM
              "T2" "B" WHERE "B"."E"=:B1 AND "B"."D"=:B2 AND "B"."C"=:B3))
   5 - access("B"."C"=:B1 AND "B"."D"=:B2 AND "B"."E"=:B3)

Re-running this code with rowsource execution statistics enabled showed that the subquery ran just 6,000 times (as expected) – for a total run time that was slightly faster than the hash anti-join method (0.17 seconds – but I do have a new laptop using SSD only, with a 3.5GHz CPU and lots of memory).

Every which way, if we can get reasonable performance from the underlying table access there’s no way that introducing a “NOT EXISTS” ought to be a disaster. The worst case scenario – for some reason Oracle chooses to run a filter subquery plan and the appropriate index hasn’t been created to support it.

Footnote:

Of course, table A didn’t really exist, it was a three table join; and it didn’t produce 100M rows, it produced anything between zero and 5 million rows, and the effect of the subquery (which correlated back to two of the joined tables) was to leave anything between 0 and 5 million rows. And (apparently) the query was quick enough in the absence of the subquery (producing, for example, 1 million rows in only 5 minutes), but too slow with the subquery in place.

But that’s okay. Because of our tests we know that once we’ve produced a few million rows it takes fractions of a second more to pass them through a hash table with an anti-join to deal with the “not exists” subquery; and I doubt if we have to play silly games to push the data through a filter subquery plan in the right order to squeeze a few extra hundredths of a second from the query.

If the OP is happy with the basic select statement before the “not exists” subquery, all he has to do is take advantage of a no_merge hint:


select  {list of columns}
from
        (
        select /*+ no_merge */ .... rest of original query
        )    v1
where
        not exists (
                select  null
                from    b
                where   b.c = v1.c and b.d = v1.d and b.e = v1.e
        )
;

You’re probably wondering why the OP currently sees a performance problem as the subquery is added. The best guess is that the subquery has introduce a “magic 5% fudge factor” to the arithmetic (did you notice the cardinality of t1 dropping to 50,000 from 1M in the plan above) and made it pick a worse execution plan for the rest of the query. We can’t tell, though, since the OP hasn’t yet given us the information that would allow us to see what’s going wrong.