Dan explains it all very well, and I take back my earlier comments.
If I didn't want to see 2.0 ratified soon, I might suggest that 
at least a part of Dan's explanation should be included in the spec,
but I don't think that we can afford the time.  On a related note,
the description of DT/DD still does not match the DTD with respect
to multiple <DD>s, but I don't think that it is worth delaying the 
spec any longer if it can be avoided.
> 
> Firstly, this paragraph says "should," so it's not binding.
> 
> But here's the way I'd like to see it done:
> 	1. Break the body into block-structuring elements (and headings).
> 
> 	2. Take the data characters in the content of a block
> 	structuring element (or heading) and its descendants, and break
> 	it into words, delimited by spaces.
> 
> 	3. Typeset the words into paragraphs. Put as much space
> 	between words as necessary to make it look nice, independent
> 	of where the spaces were in the source.
> 
> This little perl ditty may help illustrate:
> 
> $html = <<EOF;
> <li>  w1
> w2
> w3 <em>w4   w5 </em>
> 
> w6 <li>w7 w8
> w9 w10
> <li>w11
> EOF
> 
> @paras = split(/<li>/, $html); # split body into paragraphs
> 				# a real parser would do this
> 				# as per SGML
> 
> shift(@paras); # perl's split operator creates an empty para
> 				# before the first <li>
> 
> s{</?\w+>}{}g for @paras; # get rid of markup inside para -- we're
> 				# only interested in data chars here.
> 
> print "start with this:\n$html";
> 
> foreach $p (@paras){
>     $p =~ s/^\s+//; # get rid of leading space
>     $p =~ s/\s+$//;		# and trailing space.
>     @words = split(/\s+/, $p);
> 
>     print "\ntypeset these words into a para: ", join(',', @words), "\n";
> }
> 
> Its output is this:
> 
> start with this:
> <li>  w1
> w2
> w3 <em>w4   w5 </em>
> 
> w6 <li>w7 w8
> w9 w10
> <li>w11
> 
> typeset these words into a para: w1,w2,w3,w4,w5,w6
> 
> typeset these words into a para: w7,w8,w9,w10
> 
> typeset these words into a para: w11
> 
> 
> Dan
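
Dan's script stops at step 2 and only prints the word list. For what
it's worth, step 3 might be sketched as below. This is only an
illustration under my own assumptions: fill_words is a hypothetical
helper, the greedy fill and the column width are my choices rather than
anything the spec or Dan prescribes, and a real formatter would be free
to stretch the inter-word space as well.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the spec or from Dan's script):
# greedily pack words into lines of at most $width columns, joining
# them with a single space regardless of the spacing in the source.
# A word longer than $width is placed on a line of its own.
sub fill_words {
    my ($width, @words) = @_;
    my @lines;
    my $line = '';
    foreach my $w (@words) {
        if ($line eq '') {
            $line = $w;                     # first word on the line
        } elsif (length($line) + 1 + length($w) <= $width) {
            $line .= " $w";                 # word still fits
        } else {
            push @lines, $line;             # line is full, start a new one
            $line = $w;
        }
    }
    push @lines, $line if $line ne '';
    return @lines;
}

# The first paragraph from Dan's example, set in an 8-column measure.
my @words = qw(w1 w2 w3 w4 w5 w6);
print "$_\n" foreach fill_words(8, @words);
```

The point is the same as Dan's: the line breaks come from the measure,
not from where the spaces and newlines happened to fall in the source.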