
Three Quick Ways to Avoid Widows


A few months ago I threw together a quick redesign of the Learning jQuery site. It's nothing fancy, mind you, but I was itching to retire the thin veil covering the tired old WordPress Kubrick theme, so something had to be done.

Almost immediately upon changing the font-family and font-size of the blog post titles, I noticed a few unsightly widows. (Just to clarify, we're talking about typographical widows. My mother already suspects me of avoiding her; I don't want to add to her anxiety. ;) )

Here is an example of one such widow:


See how the last word, "plugin," appears on its own line? According to a couple designerly friends of mine, that's a no-no. So, I considered for half a minute how to get that title to look more like this:

The lowly yet lovely non-breaking space (&nbsp;) would do the trick, but how to substitute it for the regular, breaking space? I certainly wasn't about to manually edit all of the entries' titles. Not only would it take too long, but it would also pollute the markup with something that really shouldn't be there. No, what I needed was a little JavaScript.

Selecting the Titles

On this site, entry titles are wrapped in <h2><a href="foo"></a></h2>, which can be selected in jQuery with $('h2 a'). Easy. Now, because I want to manipulate the text of each title independently, I'll need to use the .each() method, which is basically a chainable for loop. Inside the .each() is where I substitute the last breaking space with the non-breaking variety. Here are three ways I came up with to achieve this.

Array

The first approach was to convert the title string into an array of words and then stitch the array items back together, dealing with the last one specially.

JavaScript:
  1. $(document).ready(function() {
  2.   var h2Text = '';
  3.   $('h2 a').each(function() {
  4.     var h2Array = $(this).text().split(' '),
  5.       h2Last = h2Array.pop();
  6.     h2Text = h2Array.join(' ') + '&nbsp;' + h2Last;
  7.     $(this).html(h2Text);
  8.   });  
  9. });

A couple things to note about line 5 above: (a) the variable is actually being declared (with a "var") because it's separated from the previous variable declaration by a comma rather than a statement-ending semicolon. (b) The JavaScript .pop() array method "pops" the last item off the array and returns it; so it's no longer part of h2Array, but its value is stored in the h2Last variable. This is especially handy for us, because we don't want the last word to appear twice.

Line 6 joins the remaining array items with a space between them and then appends the non-breaking space and the popped last item. Line 7 dumps that concatenated string back into the title, inside <h2><a></a></h2>.
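One edge case is worth a quick sketch: a one-word title has no space to replace, so popping its only word and re-joining would put a non-breaking space in front of it. The function below is a hypothetical, DOM-free version of the array approach (the name bindLastWordArray is an assumption, not something from the site's code) that returns such titles untouched.

```javascript
// Hypothetical helper (not part of the original script): joins the last two
// words of a plain-text title with a non-breaking space entity.
function bindLastWordArray(title) {
  var words = title.split(' ');

  // A one-word (or empty) title has no space to replace, so leave it alone.
  if (words.length < 2) {
    return title;
  }

  // .pop() removes the last word from the array and returns it.
  var last = words.pop();
  return words.join(' ') + '&nbsp;' + last;
}

// Possible use with the selector from this post (assumes jQuery is loaded):
// $('h2 a').each(function() {
//   $(this).html(bindLastWordArray($(this).text()));
// });
```

Keeping the string work in a plain function also makes it easy to try out in a console without touching the page.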

String

The next approach involved working solely with strings, using the slice and lastIndexOf methods to split the string into two pieces: one leading up to the last space, and one immediately following it.

JavaScript:
  1. $(document).ready(function() {
  2.   var h2all, h2a, h2b;
  3.   $('h2 a').each(function() {
  4.     h2all = $(this).text();
  5.     h2a = h2all.slice(0, h2all.lastIndexOf(' '));
  6.     h2b = '&nbsp;' + h2all.slice(h2all.lastIndexOf(' ')+1);
  7.     $(this).html(h2a + h2b);
  8.   });
  9. });

As line 7 demonstrates, the two sliced strings are stitched back together to keep the last two words on the same line.
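The string approach has its own edge case. When a title contains no space, lastIndexOf returns -1, so slice(0, -1) drops the final character and slice(0) returns the whole title, turning "Widows" into "Widow&nbsp;Widows". Here is a minimal sketch of a guarded version; the function name bindLastWordString is hypothetical.

```javascript
// Hypothetical helper (not part of the original script): replaces the last
// regular space in a plain-text title with a non-breaking space entity.
function bindLastWordString(title) {
  var lastSpace = title.lastIndexOf(' ');

  // lastIndexOf returns -1 when there is no space; nothing to replace then.
  if (lastSpace === -1) {
    return title;
  }

  // Everything before the last space, the entity, then everything after it.
  return title.slice(0, lastSpace) + '&nbsp;' + title.slice(lastSpace + 1);
}
```

Looking up the index once and reusing it also avoids calling lastIndexOf twice per title.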

Regular Expression

The final technique is the one I ended up sticking with, partly because it's the tersest and partly because I have a fondness for regular expressions:

JavaScript:
  1. $(document).ready(function() {
  2.   var h2Text = '';
  3.   $('h2 a').each(function() {
  4.     h2Text = $(this).text().replace(/ (\w+)$/,'&nbsp;$1');
  5.     $(this).html(h2Text);
  6.   });
  7. });

The distinguishing feature here is line 4, which uses the string .replace() method with a regular expression. This method takes two arguments: a pattern to match against and a replacement value. For the pattern, which appears between the two slashes, we first match a space and then match one or more "word characters" (letters, numerals, or underscores). The parentheses capture everything but that initial space, and the "$" at the end ensures that the match appears at the end of the string. The replacement argument starts with the non-breaking space and follows with $1, which refers to the first (and in our case, only) parenthetical "capture group." (Please forgive me if I've provided too much detail about the regular expression. I'm never quite sure how much of this stuff is worth mentioning, but since this entry is targeting beginners, I suppose it's better to err on the side of too much.)
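To see what the pattern actually captures, it can help to run it against a sample string outside the page. The sketch below uses a made-up title (an assumption for illustration, not one of the site's real entries) and inspects the match before doing the replacement.

```javascript
// Hypothetical sample title, used only for illustration.
var sampleTitle = 'Introducing the Cycle plugin';
var lastWordPattern = / (\w+)$/;

// .match() returns an array: index 0 is the whole match (including the
// leading space), index 1 is the first capture group (the word alone).
var found = sampleTitle.match(lastWordPattern);
var wholeMatch = found[0];   // ' plugin'
var captured = found[1];     // 'plugin'

// In the replacement string, $1 stands for that captured word, so only the
// regular space in front of it is swapped for the entity.
var fixedTitle = sampleTitle.replace(lastWordPattern, '&nbsp;$1');
// fixedTitle is 'Introducing the Cycle&nbsp;plugin'
```

If the pattern doesn't match, .replace() simply returns the original string, so one-word titles pass through unchanged.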

Update

A few people pointed out in the comments that my regular expression could be improved, and I agree. In particular, as noted by Art Lawry, the \w can be changed to \S (that's an uppercase S) to match any non-whitespace character. That way it'll match characters such as ö and ç as well.
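A quick comparison shows the difference. The sample titles below are made up for illustration, and bindLastWord is a hypothetical helper name; the sketch only demonstrates how \w and \S behave on accented characters and trailing punctuation.

```javascript
var wordPattern = / (\w+)$/;      // original pattern from the post
var nonSpacePattern = / (\S+)$/;  // pattern suggested in the comments

// Hypothetical helper: applies whichever pattern is passed in.
function bindLastWord(title, pattern) {
  return title.replace(pattern, '&nbsp;$1');
}

// \w does not match 'é' or '?', so these titles are left unchanged...
var accentedWithW = bindLastWord('Un petit café', wordPattern);
var punctuatedWithW = bindLastWord('Are You Ready?', wordPattern);

// ...while \S matches any non-whitespace character and fixes both.
var accentedWithS = bindLastWord('Un petit café', nonSpacePattern);
var punctuatedWithS = bindLastWord('Are You Ready?', nonSpacePattern);

// A title whose last gap is already a non-breaking space character
// (U+00A0) is left alone, because \S does not match U+00A0.
var alreadyFixed = bindLastWord('Un petit\u00A0café', nonSpacePattern);
```

That last case means running the \S version a second time on text read back from the page should not glue a third word onto the last line.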

By the way, all three of these code examples can be reduced in length quite a bit. For example, the regular expression example can be pared down from 7 to 5 lines if we don't bother with the h2Text variable and instead do something like this:

$(this).html($(this).text().replace(/ (\w+)$/,'&nbsp;$1'));

However, the code is usually easier to read and maintain (for me, at least) if the value is first stored in a variable.

Any suggestions for improvement here? Any other approaches that you would recommend instead? Leave a comment.



24 comments

  1. Karl, that was a very creative way to fix the widow issue. I had just resigned myself to having them, but I guess I don't have an excuse now.

  2. This is the perl coder in me coming out, but couldn't you declare your variable in the loop and give it an or value?

    var h2Text = $(this).text().replace(/ (\w+)$/,'&nbsp;$1') || '';

    Would only save one line.

  3. Neat trick. I used a similar trick (though just by ramming the &nbsp; straight in) to keep post codes from breaking on the space.

  4. elpres

    Good idea. One suggestion though: You should perhaps add punctuation marks to the regex, something like :

    h2Text = $(this).text().replace(/ ([\w\?\.!…-]+)$/,'&nbsp;$1');

    (That's a regular dash, not an en dash.)

  5. roman

    Nice solution, thanks!

  6. deci

    The search regex doesn't work with words containing html entities.

  7. @deci and @elpres

    For the search regex not matching HTML entities and certain punctuations, I think replacing \w (any word character) with \S (any non-whitespace character) should do the trick.

  8. Rob

    This is excellent, thanks very much.

    I'm now trying to adapt it to control widows within paragraphs of text as well.

  9. Trevor

    I'd be inclined to do this server-side in my template. I don't think that would be "polluting my markup with something that shouldn't be there." What I wouldn't want to do is change the actual titles in the database, but adding the markup server-side on the fly makes sense to me.

  10. Thanks, everyone, for the great comments! I think I like Art's suggestion the best for improving the regex: using \S instead of \w. I suppose we could also use [^ ], but \S is more to the point. Must have been the ugly American in me that prevented me from realizing the need to match against extended characters.

  11. Karl, one thing I also noticed about this method of avoiding widows: if you have an inline element nested in your heading like this

    <h4>heading goes here <em> July 7th 2008</em> </h4> 

    with some style applied to it then the inline element loses its style. I don't know exactly what happens but it bypasses the styles applied to that inline element. I thought I'd point it out since I just ran into this.

  12. Hi Juliano,

    Good catch there. Yeah, I didn't account for nested tags in the headings. If you were to do that, you'd need to change h2all = $(this).text(); to h2all = $(this).html();.

    • Oh, I'm so happy this was pointed out! I was just getting worried that all my links were becoming plain text when I implemented this, but this little change solved it all!

      Great script, thanks! =)

  13. Matt

    jQuery rocks! Thanks for the great articles!

  14. Why not use CSS property white-space: nowrap;?

    
      $('h2').each(function() {
        var h2Contents = $(this).html().split(" ");
        h2Contents[h2Contents.length-2] = '<span style="white-space: nowrap; color: red;">' + h2Contents[h2Contents.length-2];
        h2Contents[h2Contents.length-1] = h2Contents[h2Contents.length-1] + '</span>';
        $(this).html(h2Contents.join(' '));
      });
    
  15. MarcusT

    Presenting three different ways to do it is certainly a good way to get people switched on to how best to tackle something, but surely each incurs a different performance hit?

    My money's on the second (string-based) method as the best performing approach, but it would be helpful to provide benchmark results in your post so that each solution's performance can be taken into account...

    And I wonder what the best way would be of ensuring that if any of the three approaches are accidentally (or intentionally) run multiple times the changes are not reapplied each time, resulting in new lines of 3,4,5,etc words...?

  16. Andrew

    This is a great tip! I am glad that I found this....

    What if I wanted to change all hyphens to em dashes? And better yet in a specified paragraph class?

    Is this possible? How would you do that?

  17. Hi Andrew,

    A quick way to do that with a regular expression would be :

    
    $('p.someclass').each(function() {
      var original = $(this).html();
      $(this).html( original.replace(/-/g, '—') );
    });
    

    Keep in mind, though, that this will mess up IDs or class names that have hyphens in them, if they're in those paragraphs. I don't have time to work out how to avoid that at the moment, but if someone else would like to give it a shot, go for it. :)

  18. Andrew

    I just noticed that this does not work if the last word is a link.

    Does anyone know how to fix that?

  19. Thanks for the great article. It really helped me in one of my projects.

  20. I've had some trouble with this script messing with img tags inside p tags. I've been using the following to avoid it, but it's not great because it just ignores paragraphs that have images...

    $('p:not(p:has(img))').html(function(i,html){
    return html.replace(/ (\S+)$/,'&nbsp;$1');
    });

    but works for most things..
    *note* requires jquery 1.4

  21. herostwist

    I use .replace(/\s+([^\s]+)\s*/, '&nbsp;$1'). This regex copes with all punctuation and special characters, and also with trailing spaces before closing tags.

  22. Apologies for being a stickler, but these aren't widows, they're orphans.

    A widow is a line of a paragraph that is too long that gets pushed onto the following page, resulting in a single line at the top of the next page.

    An orphan is a single word that is pushed off the end of a line by word wrapping and ends up on its own on the following line.

    Again sorry for being a pedant!

2 Pings

  1. [...] the Learning jQuery site offers three methods for applying a non-breaking space (&nbsp;) between the last two words of a title to [...]
