Whenever I come across this problem, my first instinct is to go through every character and build a string, something like this:
DECLAREOf course that means I have to go through the loop at least 18 times. How could I loop less?
l_string VARCHAR2(100) := 'THIS_LITTLE_STRING';
l_character VARCHAR2(1);
BEGIN
FOR i IN 1..LENGTH( l_string ) LOOP
l_character := l_character || SUBSTR( l_string, i, 1 );
END LOOP;
END;
/
Well, I would need to know the delimiter first, in this case, it will be the underscore character:
DECLAREOK, that helps. So what can I do with that?
l_string VARCHAR2(100) := 'THIS_LITTLE_STRING';
l_character VARCHAR2(1);
l_delimiter VARCHAR2(1) := '_';
BEGIN
...
DECLAREOK, now I remember what I always get confused. There seems to be this endless stream of SUBSTR and INSTR involved. It's difficult to follow...for me anyway. So here's what I've come up with for now. I am aware of regular expressions and the DBMS_UTILITY.COMMA_TO_TABLE procedures, but for some reason, I like to reinvent the wheel. No, not really, I just find it challenging. More challenging would be for me to start using regular expressions...Anyway, here goes my attempt at splitting a URL string:
l_string VARCHAR2(100) := 'THIS_LITTLE_STRING';
l_new_string VARCHAR2(100) := l_string;
l_token VARCHAR2(30);
l_character VARCHAR2(1);
l_delimiter VARCHAR2(1) := '_';
BEGIN
LOOP
EXIT WHEN l_new_string IS NULL;
l_token := SUBSTR( l_new_string, 1, INSTR( l_new_string, '_', 1 ) );
...
DECLAREVoila! My own URL String Tokenizer.
TYPE my_table IS TABLE OF VARCHAR2(100);
l_table MY_TABLE := MY_TABLE();
l_string VARCHAR2(1000) := 'TESTING=YES&&BOLLOCKS=SOMETHING&&&testing=DF';
l_keyvalue VARCHAR2(100);
l_delimiter VARCHAR2(5) := '&&';
l_delimiter_length INTEGER := LENGTH( l_delimiter );
l_counter INTEGER := 0;
l_new VARCHAR2(4000);
l_start INTEGER := 1;
l_end INTEGER := 0;
BEGIN
IF SUBSTR( l_string, -1, 1 ) <> l_delimiter THEN
l_string := l_string || l_delimiter;
END IF;
l_new := l_string;
LOOP
l_counter := l_counter + 1;
l_end := INSTR( l_new, l_delimiter, 1 );
l_keyvalue := SUBSTR( l_new, 1, l_end - 1 );
EXIT WHEN l_keyvalue IS NULL;
l_table.EXTEND(1);
l_table(l_counter) := l_keyvalue;
l_start := l_start + ( l_end + ( l_delimiter_length - 1 ) );
l_new := SUBSTR( l_string, l_start );
END LOOP;
p( 'table counter: ' || l_table.COUNT );
FOR I IN 1..l_table.COUNT LOOP
p( 'string: ' || l_table(i) );
END LOOP;
END;
/
CJUSTICE@TESTING>@T
table counter: 3
string: TESTING=YES
string: BOLLOCKS=SOMETHING
string: &testing=DF
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.03
Now I just have to write something that will split up the key/value pairs...
5 comments:
While I don't use RegExp, I really should, especially as I seem to be doing a great deal of string parsing. I've gotten handy with INSTR and SUBSTR etc, but yeah, RegExp would be better.
I find that I end up often doing the character by character thing when I do VBA work, primarily because I don't really know VBA, but I need to use it a great deal (annoying, yes). But I'm getting better there too.
This kind of char by char is the parsing equivilent to RBAR.
;)
@moleboy
Had to look up RBAR, is that a SQL Serverism?
Even on the odd occasion I use Java I go straight for StringTokenizer (I've even used a StreamTokenizer as well). For some reason I knew, or was aware, of these already solved problems and never ventured down the char by char path.
Dunno the derivation of RBAR. I first stumbled into it when my brother (also a database geek and significantly smarter than I am...at least with theory) was talking about some insane update he had to do to 15 million rows and he caught himself writing a pl/sql procedure to do it RBAR instead of a sql statement.
It was actually an interesting conversation weighing the pros and cons of the "slow but easy to write and support" PL/SQL method and the "complex but fast and efficient" SQL update.
The complex, but efficent sql to do this would be:
select substr( tmp, 1, instr( tmp || sep, sep ) - 1) split
from ( select substr( url, instr( sep || url, sep, 1, level ) ) tmp
, sep
from ( select 'TESTING=YES&&BOLLOCKS=SOMETHING&&testing=DF' url
, '&&' sep
from dual
)
connect by level <= length( url ) - length( replace( url, sep, '' ) ) + 1
)
Hi ,
Really need some help on parsing URL string with Oracle Regular Expression.
The new_request_page_str needs to be broken up like this:
Example 1:
https://www23.fs.ml.com/marketdata/marketdataisapilib.dll
Result 1:
Web_Page_Str: https://www23.fs.ml.com/marketdata/marketdataisapilib.dll
Domain_Name: fs.ml.com
Website_name: www23
Resource_Path: /marketdata/marketdataisapilib.dll
Directory_1_name: /marketdata
Directory_2_name: null
.
.
.
Directory_10_name: null
Resource_name: marketdataisapilib.dll
Resource_type: dll
Example 2:
https://www23.fs.ml.com/marketdata/allquote.js
Results 2:
Web_Page_Str: https://www23.fs.ml.com/marketdata/allquote.js
Domain_Name: fs.ml.com
Website_name: www23
Resource_Path: /marketdata/allquote.js
Directory_1_name: /marketdata
Directory_2_name: null
.
.
.
Directory_10_name: null
Resource_name: allquote.js
Resource_type: js
Example 3:
https://www23.fs.ml.com/images/downarrow.gif
Result 3:
Web_Page_Str: https://www23.fs.ml.com/images/downarrow.gif
Domain_Name: fs.ml.com
Website_name: www23
Resource_Path: /images/downarrow.gif
Directory_1_name: /images
Directory_2_name: null
.
.
.
Directory_10_name: null
Resource_name: downarrow.gif
Resource_type: gif
Post a Comment