Thanks, Robert. A couple of additional questions.

  • Which would you say is more accurate: the used size from GetGlobalSize (the Size argument in the code above; 725 MB) or the %GSIZE result (760 MB)?
  • What's your take on GetGlobalSizeBySubscript, when called for all first-level subscripts, yielding a far smaller total of 8.56 MB? Could it be that it counts only the contents of ^GLOBAL(sub1) itself and not the contents of ^GLOBAL(sub1,sub2)? BTW, I saw in the debugger that GetGlobalSizeBySubscript calls %GSIZE internally.

That's the code I ended up with. Thanks for your help, everybody!

 ; str is parsed into two arrays, words and separators (spaces and punctuation)
 ; Trim leading and trailing spaces here if needed
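 S L=$L(str),(currWord,currSep)="",cnt=0
 F i=1:1:L {
  S currChar=$E(str,i,i)
  I $MATCH(currChar,"\w") {
   ; word character: extend the current word and store any pending separator
   S currWord=currWord_currChar
   I currSep'="" {
    S sepAr(cnt)=currSep,currSep=""
   }
  }
  ELSE {
   ; separator character: extend the current separator and store any pending word
   S currSep=currSep_currChar
   I currWord'="" {
    S cnt=cnt+1,wordAr(cnt)=currWord,currWord=""
   }
  }
 }
 ; Store whatever is still pending after the last character
 I currWord'="" S cnt=cnt+1,wordAr(cnt)=currWord
 I currSep'="" S sepAr(cnt)=currSep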
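
For a quick sanity check, the two arrays can be stitched back together and compared with the input. This is only a sketch: it assumes the convention used above, where sepAr(0) holds any leading separator and sepAr(n) holds the separator that follows wordAr(n), and it assumes the final word or separator of the string was stored as well. The variable out is a name made up for this example.

 ; Rebuild the string: leading separator, then each word followed by its separator
 S out=$G(sepAr(0))
 F i=1:1:cnt {
  S out=out_wordAr(i)_$G(sepAr(i))
 }
 ; Compare the rebuilt string with the original input
 W out=str

The final WRITE prints 1 when the rebuilt string equals str and 0 otherwise.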