2013-03-22

Counting Character Occurrences in SQL

Which characters and how many times they are used in a rows of a column. Just change following src to a desired query.

with src as (
   SELECT ename col FROM emp
)
SELECT letter
     , COUNT (*) cnt
  FROM (    
 SELECT SUBSTR (col, LEVEL, 1) letter
   FROM (
  SELECT col
       , ROWNUM rn
       , LENGTH (col) len
    FROM src
        )
 CONNECT BY LEVEL <= len
        AND rn = PRIOR rn
        AND PRIOR DBMS_RANDOM.VALUE IS NOT NULL
        )
GROUP BY letter
ORDER BY letter
;

This approach is not the most efficient way to do this. But it does the job. For a 10 million rows varchar2(30) column it took about half an hour to calculate the result. Maybe in database map reduce approach could be used to speedup the execution. Something like my euler17 implementation.

No comments:

Post a Comment

About Me

My photo
I am Timo Raitalaakso. I have been working since 2001 at Solita Oy as a Senior Database Specialist. My main focus is on projects involving Oracle database. Oracle ACE alumni 2012-2018. In this Rafu on db blog I write some interesting issues that evolves from my interaction with databases. Mainly Oracle.