Library Cataloging Rule For Mac And Mc

When using the online Cutter Tool, convert Mc and M' to Mac for all classification schemes, so that authors or titles continue to be arranged as they are with the print Cutter table.

Individual works by an author. YCAL classification is alphabetical by author and, under each author, chronological by date of first edition.

Name changed by AACR2. Even if the last name is changed by AACR2, continue to Cutter the author to the pre-AACR2 form.

Filing rules. When shelflisting, keep in mind some significant filing rules that affect the alphabetic sequence of author Cutters.


< Wikipedia talk:Categorization of people (Redirected from Wikipedia talk:Categorization of people/Mac and Mc)

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.


Ordering and sort-keys

I would like to add several new statements to the guidelines, but thought I should lay them open to debate first:

1. Capital letters which are not at the beginning of a word should be converted to lower case, so that James BeauSeigneur should sort as 'Beauseigneur, James', not 'BeauSeigneur, James'.
2. Punctuation (including hyphens and apostrophes but not spaces) should be stripped out of names for sorting purposes, such that Maurie Fa'asavalu sorts as 'Faasavalu, Maurie', not 'Fa'asavalu, Maurie', and D:Fuse sorts as 'Dfuse'.
3. Accented characters should be replaced with their unaccented counterparts: Uldis Bērziņš sorts as 'Berzins, Uldis', not 'Bērziņš, Uldis'.
4. Disambiguating terms should be omitted: Will Smith (comedian) sorts as 'Smith, Will', not 'Smith, Will (comedian)' or 'Smith (comedian), Will'.
5. All parts of the title should be included except disambiguating terms; no names should be abbreviated or omitted, including middle names, where they form part of the title.
6. Similarly, the sort key should not include any term which does not appear in the title. For instance, where an article's title is a nickname or stage name, the article should not be sorted by the person's real name. Also, if the title is an abbreviated name such as 'Don' or 'Chris', the article should be sorted by that name, and not the long version ('Donald', 'Christopher').
7. Suffixes should be placed at the end of the sort key, such that Robert J. Smith II sorts as 'Smith, Robert J., II', not 'Smith II, Robert J.'
8. Second and subsequent capital letters in a series of capital letters should be converted to lower case, such that RJD2 sorts as 'Rjd2'.
9. The prefix 'Mc' is generally sorted as 'Mac', such that Anna McCurley sorts as 'Maccurley, Anna', not 'Mccurley, Anna' or 'McCurley, Anna'.

Any comments? --Stemonitis 13:19, 6 January 2007 (UTC)
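As a rough illustration of how the proposed rules could combine into one sort key, here is a minimal Python sketch. The names `sort_key`, `clean_word`, `strip_accents` and the `merge_mc` flag are hypothetical, and the sketch assumes that the surname is simply the last word of the title; it is not how MediaWiki or any guideline works, and suffixes such as 'II' are deliberately left out.

```python
import re
import unicodedata


def strip_accents(text: str) -> str:
    """Replace accented letters with their unaccented counterparts."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean_word(word: str) -> str:
    """Strip punctuation, then keep only the first letter in upper case."""
    word = re.sub(r"[^\w\s]", "", word)
    return word.capitalize()


def sort_key(title: str, merge_mc: bool = False) -> str:
    """Build a 'Surname, Forenames' key from an article title (assumption:
    the surname is the last word; suffixes are not handled)."""
    # Drop a trailing disambiguating term such as "(comedian)".
    title = re.sub(r"\s*\([^)]*\)\s*$", "", title)
    title = strip_accents(title)
    *forenames, surname = title.split()
    if merge_mc:
        # Optional and contested in the discussion: sort 'Mc' as 'Mac'.
        surname = re.sub(r"^Mc(?=[A-Z])", "Mac", surname)
    parts = [clean_word(surname)]
    if forenames:
        parts.append(" ".join(clean_word(word) for word in forenames))
    return ", ".join(parts)


for example in ["James BeauSeigneur", "Maurie Fa'asavalu", "D:Fuse",
                "Will Smith (comedian)", "RJD2"]:
    print(example, "->", sort_key(example))
print("Anna McCurley ->", sort_key("Anna McCurley", merge_mc=True))
```

Keeping the 'Mc' rule behind a flag reflects the fact that it is the one proposal here whose desirability is openly in doubt.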

Re. #1: Not needed imho. Would this make an essential difference? We try to avoid complexity in guidelines where there's no real issue to address.

Re. #2: Seems useful, apart from the 'hyphens' which I wouldn't change (in other words: keep them in the sort key as they are in the page name). Note also:

This is not a 'people names' exclusivity, so should go in Wikipedia:Categorization#Category sorting;
Attention should be drawn, however, to the fact that the current 'people' sorting rule usually introduces punctuation (a comma) where there wasn't any in the original article name, and also that the comma in, for example, Charles de Secondat, baron de Montesquieu is retained. With the added comma, that example uses two commas in the category sort key ([[Category:Enlightenment philosophers|Montesquieu, Charles de Secondat, Baron de]]). Note that this is an example currently in the categorization of people guideline. I also don't think that geographic names using a comma in the page name (e.g. London, Ontario) omit that comma when sorting. Maybe the simplest would be to make an exception for commas in the 'no punctuation' rule. Anyway, if 'commas' are a general exception, this rule should move to Wikipedia:Categorization#Category sorting entirely.
Also, I see that you kept the period in e.g. 'Smith, Robert J., II'. This needs some refining if we want to use 'remove punctuation' as a more or less general rule.

Re. #3: In fact already included. Since it's not only 'people' that may have accents in their name, the issue is discussed at Wikipedia:Categorization#Category sorting. These general rules are linked from Wikipedia:Categorization of people#Ordering names in a category.

Re. #4: Not needed imho. 'Smith, Will' and 'Smith, Will (comedian)' would generally lead to the same sorting result. Warning against 'Smith (comedian), Will' seems a bit like treating our editors in a childish way. WP:BEANS.

Re. #5: Not needed: redundant redundancy.

Re. #6: Not needed imho, self-evident.

Re. #7: Useful. Since you used the generic term 'suffix' it also encompasses the 'disambiguating terms' of #5 (so that the more complex and in fact contradictory formulation of #5 is certainly not needed separately).

Re. #8: No. This would be an issue for Wikipedia:Categorization#Category sorting anyhow (I mean, consecutive capitals and letters in a name is not a 'people' exclusivity). And then I'd oppose it. There's no common sense in it, imho. This would, to name only one of the multiple problems this would create, make the current {{3CC}} template impossible to use, and then after all the instances of where this template is used are manually converted, the sorting order in the Category:Lists of three-character combinations would be exactly the same as it is now. Solution in search of a problem.

Note that for stage names etc using non-standard letters in their name (like for instance DJs but the same could probably be said about wrestlers etc) I do believe sometimes a bit of creativity is needed, but the kind of creativity that is hard to catch in rules:

Use of {{DEFAULTSORT:Rjd2}} in the RJD2 article seems perfectly OK to me (didn't even know about the existence of {{DEFAULTSORT:..}}..);
?uestlove: [[Category:American hip hop musicians|Questlove]] and the first entry of List of hip hop DJs and producers#0–9 show a different sorting order.. I wouldn't know which is more appropriate..
[[Category:Hip hop DJs|DJ Quik]] and [[Category:Hip hop DJs|Cam, DJ]] reflect different approaches too (note that both are under 'D' in List of hip hop DJs and producers). Also here I wouldn't know what works best in general. Couldn't the DJ-interested people agree on this? Or try to find out which way this is done most often in English? For example, check a few record stores, and see how the CD's are usually sorted?

Re. #9: Don't know about that one for sure: what is most habitual in English? Unless it can be clearly demonstrated that in general in English that is the way names are sorted, I'd reject it as redundant complexity. --Francis Schonken 14:15, 6 January 2007 (UTC)

These examples are all based on real situations that I've come across, so it's too late for WP:BEANS to apply. Of all of them, the internal capitals (1 and 8) is the most important problem, in my opinion. The MediaWiki software sorts all capitals before all lower-case letters, with the result that names like 'DeWoskin' would be sorted before 'Deac', which is clearly at odds with alphabetical order. The ideal situation would be for sort keys to be in all-capitals, but we've gone too far with the current system to do that. The best approximation is to consistently capitalise the first letter of a word only. Similarly, punctuation tends to come before A-Z/a-z, which also screws up the sorting, though I would perhaps be prepared to relent on the hyphens.
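The effect described above can be reproduced with any comparison that works on raw character codes. The sketch below uses Python's default string ordering as a stand-in for the software's behaviour (an assumption; it is not MediaWiki code), and 'Dean' and 'Dexter' are made-up neighbours added for illustration.

```python
# 'DeWoskin' and 'Deac' come from the discussion; the other two names are
# hypothetical fillers.
names = ["Deac", "DeWoskin", "Dean", "Dexter"]

# Plain comparison by character code: 'W' (87) sorts before 'a' (97).
by_code_point = sorted(names)

# Capitalising only the first letter of the key restores alphabetical order.
by_first_capital_only = sorted(names, key=str.capitalize)

print(by_code_point)          # ['DeWoskin', 'Deac', 'Dean', 'Dexter']
print(by_first_capital_only)  # ['Deac', 'Dean', 'DeWoskin', 'Dexter']
```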

Perhaps there is some redundancy here, but I wanted to explain my ideas fully. I'm open to suggestions for better wordings. If you think it would be clear that 'suffix' also includes 'disambiguating term' (I wouldn't consider the disambig. a part of the name, but would consider the suffix as such, for instance, so would interpret the two differently), then that's fine. I see, however, no harm in reiterating rules held elsewhere. If it's important to strip out accents (and for sorting in categories, it obviously is), it would be appropriate to have a statement (like my 3), and link it to the appropriate guideline (to prevent real or apparent conflict between the two guidelines).

In response to your comments to 4, 5 and 6, I can only re-iterate that all these mistakes have been made in the past (and in good faith) and will almost certainly be made again. If that could be prevented by the inclusion of a short sentence here, then I see no reason not to. Providing too much information is usually better than providing too little. For the 'Mac' question, we probably need to check how other encyclopædias do it; I haven't any to hand. --Stemonitis 14:56, 6 January 2007 (UTC)

Re 'real situations', yes, I kinda noticed. For that reason I elaborated my answer to #8 a bit, but ended up in an edit conflict. Anyway, you can see my elaborations above, resuming (partly) to: it's not possible to catch everything in rules. Rules don't outdo using your common sense.

Also, if 'mistakes' have been made, some of these are so evident, that it would be unnecessary to write guidelines about them. Typos are corrected by the dozen every second, yet there is no guideline listing all possible typos, warning against each of them separately. Indeed summing up 'most popular' typos in a guideline would be somewhat of a WP:BEANS approach, while even for popular typos most people don't make them. --Francis Schonken 15:22, 6 January 2007 (UTC)

English usage on the alphabetization of Mc varies extravagantly; I have even seen it alphabetized as a separate letter, after M. It is probably simplest to let them fall between Mb and Md - because that is what editors who haven't looked at this guideline are most likely to do; and for some names the distinction between McX and MacX represents a difference of families. SeptentrionalisPMAnderson 18:40, 10 January 2007 (UTC)

Yes, this is what I tend to do at the moment, and I have left that recommendation out from my new draft (see below). --Stemonitis 18:42, 10 January 2007 (UTC)

On the same grounds, I'm not sure about the decapitalization of internal capitals, which will at least sort as letters. Internal š, say, is different; it will sort in a place that is definitely wrong; but there's nothing really wrong with sorting Macbean separately from MacBean; and, again, trying to enforce this will mean an awful lot of corrections of what editors will naturally do. (And for not much profit; most categories won't have more than one Scot, so that, although we will have to correct all the MacBeans, for most of them it will have no effect on the category.) SeptentrionalisPMAnderson 19:28, 10 January 2007 (UTC)

Most people will not realise that the software effectively uses two non-overlapping alphabets. B comes between A and C in alphabetical order, regardless of whether it's uppercase or lowercase, so there is definitely something wrong with sorting Macbean after MacWilliam or MacBean before Macalistair. I don't believe that any reputable reference work puts all the people with a capital letter after the 'Mac' before all the people with a lower-case letter after the 'Mac' (the dictionary I have to hand certainly doesn't), and I can't really see any argument in favour of it. (In fact, the two alphabets do not even abut; the characters '[', '\', ']', '^', '_' and '`' appear between them, although those characters will rarely appear in article titles, and especially rarely straight after 'Mac').
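For anyone who wants to check the 'two alphabets' point, the short Python sketch below lists the ASCII characters that fall between 'Z' and 'a' and sorts the four names mentioned above. Python's built-in ordering is used here only as an analogue of a character-code sort.

```python
# Characters whose codes lie strictly between 'Z' (90) and 'a' (97).
gap = "".join(chr(code) for code in range(ord("Z") + 1, ord("a")))
print(gap)

names = ["Macalistair", "MacBean", "Macbean", "MacWilliam"]

# Character-code order: every capital after 'Mac' precedes every lower case.
code_order = sorted(names)
print(code_order)  # ['MacBean', 'MacWilliam', 'Macalistair', 'Macbean']

# Case-insensitive order, as a printed reference work would have it.
alphabetical = sorted(names, key=str.lower)
print(alphabetical)  # ['Macalistair', 'MacBean', 'Macbean', 'MacWilliam']
```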

The argument that many categories will contain at most one Scot does not hold water. There are plenty of categories with many people whose names begin with 'Mc' or 'Mac', and it would be helpful if they all sorted in the same way. Any kind of consistency will require a great number of changes, but that doesn't mean that we shouldn't make them, and it certainly doesn't mean that we shouldn't work out what they should be. It is to be hoped that editors who are unsure about how to categorise biographical articles will come to this page to find clarification, and find it, and so the consistency will trickle through over time even if no direct action is taken. And for those cases where it doesn't make any difference, no-one is obliged to make any change, so there's no harm done. --Stemonitis 22:54, 10 January 2007 (UTC)

OK, I've tried to cover all of the things I was trying to say into as concise a form as possible. My sandbox contains a draft of the relevant section, with my additions mostly confined to the last three bullet points. I think the extra explanation is helpful, because a lot of people seem to be confused as to why these changes are necessary. The short sentence 'The sort key should mirror the article's title as closely as possible' covers several of my points above; I wish I'd thought of that wording before. I have also tidied up the remaining text a bit, and tried to make the layout more readily comprehensible. Any comments would be greatly appreciated. --Stemonitis 17:31, 10 January 2007 (UTC)

No-one has made any complaints about the draft, so I'll copy it across (but removing the 'In categories dealing with British peers' qualifier which was included there). --Stemonitis 12:32, 12 January 2007 (UTC)

No approval either! I thought it too sketchy to give it a serious thought - didn't you see the evident errors? No, Louis IX is not a British peer, etc, etc. --Francis Schonken 13:02, 12 January 2007 (UTC)

It would probably have been better if you'd mentioned this before, but at least I'm getting the feedback now. Other than the Louis issue, are there any other changes you'd like to see made? Little things like that can easily be changed (for instance, by generalising the sub-heading to 'Nobility'), and at any time, including after the new text is included. --Stemonitis 13:31, 12 January 2007 (UTC)

Sorry, no, I don't think the rough draft at User:Stemonitis/Testing ground is worth any further consideration at this stage. If you'd be able to get the *obvious* (as in: should be obvious to anyone) errors, typos and internal contradictions out of it, that might be a step forward, and I might give it a second look. But then please also don't run ahead of other discussions regarding the changes you'd like, but which are currently far from approved (see the comments by myself and by others on this page). --Francis Schonken 15:47, 12 January 2007 (UTC)

For goodness' sake, if they're that obvious, then please tell me what they are. I have read through it, and didn't spot any typos, for instance (which doesn't necessarily mean that there aren't any). And I have heeded the comments on this page. It was too wordy and included redundancies; now it is shorter and better. It was thought that capital and lower-case letters both sorted as one alphabet; this is not the case. It was unclear whether there was consensus for forcing 'Mc' to sort as 'Mac'; that was removed. In the absence of any further or more detailed comments, it is difficult to know where the problems lie. And may I add that taking the attitude that improving it is beneath you is really quite galling. If you don't want to help, then that's fine, but please don't hinder. I can re-order the bullet points quite easily (perhaps you think that the suffixes issue belongs under 'Sort by surname' or that 'Augustine of Hippo' belongs under 'Other exceptions'), so I don't see that as a major stumbling-block, either, even if the order is currently imperfect. --Stemonitis 16:06, 12 January 2007 (UTC)

Since no concrete suggestions have been proffered, I have reinstated my improved text. I hope that any problems that are found with it can be worked out without recourse to large-scale reversion. --Stemonitis 11:55, 16 January 2007 (UTC)

Your changes were an improvement over the current state, especially in organization. The textual changes are minor, so I'm not sure what Schonken's problems are. The existing text is flimsy, but that is another point. Your two additions, eliminating punctuation and lower-casing internal capitals, lead to an alphabetization that, as far as I can tell, conforms with English-language reference works and the MARC standards, so they are recommendable. --Afasmit 13:24, 17 January 2007 (UTC)

@Stemonitis:

Since you didn't stop pressing me, I took a closer look at User:Stemonitis/Testing ground, and improved it to the best of my abilities. You will see that a lot of the improvements you have proposed are no longer there, for the reasons I explained above, mainly that they are not specific to names of people, so they should be discussed at Wikipedia talk:Categorization, and in the case a consensus for them would emerge, they should be implemented at Wikipedia:Categorization#Category sorting.
Note that even for those of the new principles proposed by you that I'm mildly positive about: *they need work*. Among others, the 'remove punctuation' rule is vague, and the way you described it is contradictory (while not mentioning the commas etc); the 'Decapitalisation after first capital' rule makes no sense in more than 90% of the cases, we *have* categories where everything is sorted capitals only, etc, etc, (all these are repeats of what I already said, and which was not implemented yet in any version of your proposals)..
Again, take these new proposals to Wikipedia talk:Categorization, they are not specific for articles on *people*, so they don't belong in a guideline that is exclusively on the categorisation of people: first you need consensus on general collation rules for categories, before any of this would be implemented in the guideline on categorisation of people. --Francis Schonken 01:36, 18 January 2007 (UTC)

Francis, forgive me if I seem underwhelmed. A lesser person might have been insulted by your edit to my draft (which replaced the entire text and further improvements with the existing text, without any attempt to incorporate any of my hard work). I will nonetheless try to assume that you were acting in good faith. The issue about removing punctuation is not important enough to warrant full-scale reversion; perhaps a clarification along the lines of 'and then' would be enough to indicate the (already implicit) order of events. (Furthermore, the examples would probably clear up any potential confusion.) Your assertion that internal decapitalisation 'makes no sense in more than 90% of the cases' seems not to be backed up with any evidence. My own (extensive) experience suggests that it nearly always causes a problem. Names such as LeXxx, DeXxxx, and so on are routinely mis-sorted. I cannot remember a name with internal capitals that did not need to be manually resorted. But this all leaves the biggest point, namely that the rules could be applied more widely. Well, yes, perhaps they could, but I don't see that that stops them being implemented here, and, more importantly, I'm not sure that they do apply so generally that they would make sense at WP:CAT. Rules that apply to human names need not apply to microchips or automobiles or other types of article. I have seen no rule stating that all guidelines must apply at the highest level possible, and I can't imagine that such a rule would be helpful. If this is the limit of your criticism, then I really can't understand why you insist on reverting. Nobody owns the text, which can be improved as long as the changes represent consensus. I believe that my draft does represent consensus here, since you are the only person opposing it, and you don't seem to want to help improve the text. --Stemonitis 02:28, 18 January 2007 (UTC)

Sorry, no, that is not how it works. You can't force the rules in Wikipedia:Categorization via the back door of Wikipedia:Categorization of people. If you continue to try that, I think it would be better to move the content of this section to Wikipedia talk:Categorization.

Note that many categories contain both people's names and entries for articles that are not about people. It made me think of the old joke about a dictatorial president-for-life who allegedly visited the UK, and was impressed by the cars driving on the left side of the road. So he came home and ordered the cars to drive on the left in his country, adding: 'if it works well, we'll order the same for trucks in a few weeks'. A bit exaggerated, but I hope this made clear what I think.

Note also that Wikipedia talk:Categorization is a much more active talk page (I mean, in terms of number of people participating), with a high average experience in categorisation issues. Two people against one, plus some additional criticism by PManderson (like it is on this page) is not going to establish 'consensus' anywhere near soon on this. --Francis Schonken 09:34, 18 January 2007 (UTC)

Ordering of surnames in specific cases: Mc, O' and St.

I think specific mention should be made of surnames where the alphabetical indexing is different to the strict spelling of the surname. The three specific cases are mentioned in the section heading. I have in front of me a late 19th century Whitaker's Almanack listing the members of the House of Commons which includes:

Lyell, Leonard
M'Arthur, Wm. A.
Macartney, Wm. G. E.
M'Calmont, James M.
M'Cartan, Michael
M'Carthy, Justin
McDermott, Patrick
Macdona, John C.
Macdonald, J. A. Murray
MacDonnell, Dr. Mark A.
M'Ewan, William
Macfarlane, Donald H.
McGittigan, Patrick

... and so on. Were this expressed in our category indexing, anyone with a surname beginning in M', Mc, or Mac, would be indexed with their surname beginning 'Mac'. Another case is surnames beginning with O', and here the list goes:

O'Driscoll, Florence
O'Keefe, Francis A.
Oldroyd, Mark
O'Neill, Hn. Robt. T.
Owen, Thomas

This means that surnames beginning with 'O' should be indexed without the apostrophe. Likewise surnames beginning with 'St.' as an abbreviation for 'Saint' should have 'Saint' spelled out in the category indexing. I propose to make this addition to the policy, but would be interested to know if anyone wishes to object. As I see it, this is not a controversial issue. Sam Blacketer 09:35, 24 February 2007 (UTC)
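The ordering in the two Whitaker's lists can be reproduced mechanically. The sketch below is only an illustration of the 'merge' convention under stated assumptions: the function name `merged_key` is made up, it treats M' and Mc as Mac only when a capital letter follows, and it drops all punctuation from the surname before comparing. It does not attempt the 'St.' to 'Saint' expansion.

```python
import re


def merged_key(entry: str) -> tuple:
    """Key for a 'Surname, Forenames' entry: M'/Mc read as Mac,
    punctuation ignored, case ignored (a hypothetical helper)."""
    surname, _, forenames = entry.partition(",")
    surname = re.sub(r"^(?:M'|Mc)(?=[A-Z])", "Mac", surname)
    surname = re.sub(r"[^A-Za-z]", "", surname).lower()
    return surname, forenames.strip().lower()


whitaker_m = [
    "Lyell, Leonard", "M'Arthur, Wm. A.", "Macartney, Wm. G. E.",
    "M'Calmont, James M.", "M'Cartan, Michael", "M'Carthy, Justin",
    "McDermott, Patrick", "Macdona, John C.", "Macdonald, J. A. Murray",
    "MacDonnell, Dr. Mark A.", "M'Ewan, William", "Macfarlane, Donald H.",
    "McGittigan, Patrick",
]
whitaker_o = [
    "O'Driscoll, Florence", "O'Keefe, Francis A.", "Oldroyd, Mark",
    "O'Neill, Hn. Robt. T.", "Owen, Thomas",
]

# With the merged key, both lists are already in sorted order.
print(sorted(whitaker_m, key=merged_key) == whitaker_m)
print(sorted(whitaker_o, key=merged_key) == whitaker_o)

# A plain character-code sort gives a different order for both.
print(sorted(whitaker_m) == whitaker_m)
print(sorted(whitaker_o) == whitaker_o)
```

The check for a following capital letter is a design choice, so that a surname that merely happens to start with the letters 'Mc' is left alone.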

I'm with you on the O'Xxx issue, but the Mac variations may be more debatable. Sorting all of Macxxx, MacXxx, McXxx, Mcxxx, M'Xxx and M'xxx as Macxxx is bewildering for people unfamiliar with the practice. It would be more transparent, albeit perhaps less traditional, simply to sort by the letters included in the name, without punctuation, and without any internal capitals. Abbreviations like St. are somewhere in between 'O' and 'Mc' in clarity, but are probably sufficiently widely understood to be sorted as 'Saint Xxx' even when titled 'St. Xxx'. However, this does bring up the overlooked problem of spaces in surnames: 'Saint John, Forename' sorts before 'Saint, Surname', because space sorts before comma in ASCII, but I doubt that any reputable reference work (except Wikipedia) follows such a scheme. --Stemonitis 17:19, 25 February 2007 (UTC)

On 'St.', I have certainly seen alphabetical lists produced recently where 'St. ' was indexed at the beginning of ST, but presumed it was always an error caused by computer sorting not recognising the abbreviation. Certainly, a surname of 'Saint' should be before 'St. John'. On the 'Mc' problem, I wonder if this is a case of national variations? I have not seen any alphabetical list which has separated them which could not be attributed to a computer's misrecognition. Sam Blacketer 11:28, 26 February 2007 (UTC)

I wonder if it's possible to distinguish between the naïveté of a computer and a deliberate, conscious simplification. Sorting 'McX' after 'Mbxxx' could easily be someone's deliberate choice. I guess we can only mirror the practice used by Britannica and other encyclopædias, whatever that might be. The only solution I can see to the multi-word surname issue is to replace the spaces (ASCII 32) in the surname with a character that would sort after the comma (ASCII 44) but before any (lower-case) letters, and my recommendation would be either to use the underscore ('_', ASCII 95), so that Ian St. John would be sorted as 'Saint_John, Ian', for example, or to simply run the words together so that Ian St. John would sort as 'Saintjohn, Ian'. I am not sure which approach I prefer. Again, it probably depends on the usage preferred elsewhere. --Stemonitis 17:36, 26 February 2007 (UTC)
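The difference between the three treatments of the space can be seen in a small Python sketch. 'Saint, Lucy' and 'Saintfield, Ann' are invented entries used only to give 'Saint John' some neighbours, and Python's default string ordering stands in for a character-code sort.

```python
with_spaces = ["Saint, Lucy", "Saint John, Ian", "Saintfield, Ann"]
with_underscore = [key.replace("Saint John", "Saint_John") for key in with_spaces]
run_together = [key.replace("Saint John", "Saintjohn") for key in with_spaces]

# Space (32) sorts before comma (44), so 'Saint John' precedes plain 'Saint'.
print(sorted(with_spaces))
# ['Saint John, Ian', 'Saint, Lucy', 'Saintfield, Ann']

# Underscore (95) sorts after comma but before lower-case letters.
print(sorted(with_underscore))
# ['Saint, Lucy', 'Saint_John, Ian', 'Saintfield, Ann']

# Run together, 'Saintjohn' is interleaved letter by letter with other names.
print(sorted(run_together))
# ['Saint, Lucy', 'Saintfield, Ann', 'Saintjohn, Ian']
```

The two proposals therefore agree about plain 'Saint' coming first, but differ on whether multi-word surnames are kept together ahead of longer single-word ones.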

Ordering of Mac, Mc and M'

This came up in one of my early edits, when I knew less than I do now about Wikipedia. For ease of finding a name, even when the correct spelling is not certain, it is common practice to sort them all as if they were Mac. As described above this can be before the other Ms or after Mab. The Telephone Directory places them all (including names such as Mace) after Mabbott and sorts all as Mac, but that is hardly a definitive source! I am told that this is documented in a 'standard text' for librarians, but have not sourced that. All I have found on-line is Everything - third section; second bullet. Collation indicates that this practice may have fallen out of favour since computerisation. I agree that much of the time it is an unnecessary complication; even in List of Scots there do not appear to be a vast number of Macs and Mcs (M' less common nowadays). However I believe we should have an agreed style. Comments? Finavon 22:16, 11 June 2007 (UTC)

I would also advocate having all McX names sorted as Macx, but it's a lot of work, and I think we would need to establish a wider agreement (perhaps including groups like Wikipedia:WikiProject Scotland) before insisting upon it. There are a lot of McX articles still sorted as McX, and it might be suitable as a bot task once a general agreement had been reached. There are also good reasons for not sorting McX as MacX, so I don't think we should be too rash. --Stemonitis 07:07, 19 June 2007 (UTC)

I've just recently been applying the current guideline, WP:Categorization_of_people#Ordering_names_in_a_category (specifically the bit that reads 'The first letter of each word should be in upper case, and all subsequent letters should be in lower case, regardless of the correct spelling of the name'), to a number of 'Mac'/'Mc' articles, when I noticed that there were too many 'Mc' articles sorted as 'Mac' for it to be coincidence. Assuming I'd missed a policy somewhere, I went looking, but I haven't seen it anywhere. This talk page seems to have the most serious discussion on the issue of whether 'Mac' and 'Mc' should be split or merged.

It's my impression that the merge approach, of having them all sort together regardless of actual spelling, is an older style that's fallen out of fashion, probably a victim of the shift from hand collation to machine collation. I used to see address books with a separate 'Mc' tab, and the 15th edition of the Encyclopedia Britannica (circa 1984) used merge (with a one-line assertion, in the Mac article, that it was the right thing to do in a well-constructed index). I also used to see small-town phone books that had a separate 'Mc' section.

However, most recent things that I can find have gone with split:

my local phone book (Austin, Texas)
the online Encyclopedia Britannica
the Author's Index of The Cambridge History of English and American Literature
a Library of Congress index of 19th century authors
online Worldbook Encyclopedia
Webster's Dictionary see [1] versus [2]

I do find some holdouts for merge:

the Columbia Encyclopedia (which is the same content as Yahoo's encyclopedia)

Personally, I favor split because:

a) it seems to be the current trend, see above.

b) it's a straightforward mechanical rule that doesn't require any judgment calls. While Wikipedia guidelines shouldn't favor simplicity to the significant detriment of the encyclopedia, when it's a relatively free matter of style, the simpler guideline should be chosen. There's already a significant learning curve to becoming a good editor; I don't think adding more special cases helps.

c) Adopting the merge approach will lead to secondary rules that have to be made. Consider Dick and Mac McDonald; under a merge approach, they sort as 'Mac'; what then do we do with the giant corporation named for them, McDonald's, sort it the same or different? And articles derivative of the company name, such as McJob and McMansion? We could have one rule for people and another for non-people, but that produces absurdity, with McMurdo Sound and Archibald McMurdo being separated.

d) It strikes me as being a slippery slope or Camel's nose. There are, for example, Chinese names that have several alternate romanizations, such as 王, which has been romanized as both Wang (surname) and Wong (surname). (And there's a different Chinese name that commonly romanizes as both 'Wong' and 'Huang'.) It could easily be argued that Chinese historic names ought to be sorted under a canonical name that represents some chosen standard romanization, rather than whatever happens to be the historically accepted romanization, but then we'd be arguing about which is best, for dozens of names, and new editors would have even more rules to learn. (We already have folks arguing that the historical romanization for Mehmed II ought to be replaced with the contemporary Turkish romanization 'Mehmet'). I think accepting a merge approach would be to throw an Apple of Discord, giving everybody with a socio-linguistic axe to grind that little bit of inspiration and ammunition to keep flogging their cause (but maybe I'm just paranoid). And no one seriously considers sorting 'Derby' and 'Darby' together, even though they evolved from the same name (and are pronounced the same in some parts of the world, though not mine).

e) It could possibly be argued that using the phonological argument for grouping 'Mac' and 'Mc', when we don't use phonological sorting anywhere else, is quietly promoting Scottish nationalism. While I'm proud of my one distant ancestress from Scotland (a McDiarmid), I don't think the Scots need special treatment.

Since it's a reasonable argument to have, given the historic precedents of sorting both ways, whatever we come up with as a consensus should be memorialized in the guideline. Studerby 21:11, 8 July 2007 (UTC)

Of these five reasons, only a) and b) are really relevant. The case under c) that it would introduce inconsistencies is not especially significant since it is rare that any such pairs of articles would be categorised together, most categories being either biographical or non-biographical, and relatively few being mixed (I can't think of any, although I'd be prepared to bet that there are some). Sorting rules already differ between categories, and that's not a problem. Transliteration of Chinese (d) is utterly irrelevant, because we would sort on whatever romanisation is used in the article title. Finally, e) doesn't apply, because this is not really a phonological argument; it's entirely to do with traditions of collation, and Mc and Mac have traditionally been grouped together, whereas others such as Darby and Derby have not. Scottish nationalism has no bearing on the issue, for several reasons, including that Mc- / Mac- surnames are also Irish. I think it is still unclear whether the general trend among relevant works is to lump or to split (my dictionary has McX explicitly under MacX), and that is more or less the only criterion we should be using to judge. Simplicity is vaguely desirable, but not at the cost of authority. Surnames and their collation are surprisingly complicated, and attempts to over-simplify are likely to be fruitless. Please, let us stick to the relevant factors and not get carried away with speculation and invention. I might also note that whatever is decided upon (if anything at all), most of the McX articles will be ill-sorted, because a large majority have internal capitals in the sort key. If there weren't so few other surnames beginning 'Mc' (cf. Leri Mchedlishvili and Guram Mchedlidze), this would be quite urgent. --Stemonitis 21:35, 8 July 2007 (UTC)

I agree with Stemonitis, and I always index both MacX and McX as Macx. This isn't just a convention, it's a convention with a reason, because the spellings are not always handled consistently across a family or even for the same person, and there is no guarantee that an entry found somewhere for 'John MacCarthy' will not be listed elsewhere as 'John McCarthy', or (without capitalisation) as 'John Mccarthy' or 'John Maccarthy'. The consistent approach makes it easier to find articles, and isn't that the whole point of indexing? --BrownHairedGirl(talk) • (contribs) 22:23, 7 November 2007 (UTC)

There's been a recent discussion of this and other alphabetisation issues at Wikipedia_talk:Manual_of_Style#Alphabetization, which may be of interest. PamD (talk) 08:22, 10 February 2008 (UTC)

Oppose It's always best to spell the person's name correctly. As a 'Mc' myself, I find it nearly offensive when someone misspells my name wrong the first time. When I correct the spelling and they continue to do so, it just makes me think they don't care about accuracy. Some will find it offensive. This is a guideline--one I will forever choose to ignore and I beg all other Wikipedians to follow my lead.--Paul McDonald (talk) 01:54, 15 July 2010 (UTC)
- details I've put together the first draft of an opposition essay here.--Paul McDonald (talk) 15:34, 15 July 2010 (UTC)
- It's not a question of spelling the name correctly. The sort key never displays anywhere; it just controls sorting. --Auntof6 (talk) 23:13, 15 July 2010 (UTC)
  - response I get that. Why in the world would Wikipedia ever choose to sort articles incorrectly?--Paul McDonald (talk) 02:54, 16 July 2010 (UTC)

'Incorrectly' is a matter of definition, not of Universal and Absolute Truth. It's long been a common sorting practice to separate out all the Mc/Mac names from the other M names; we're simply continuing that practice, one that most people are familiar and comfortable with. -- Jack of Oz .. speak! .. 13:20, 16 July 2010 (UTC)

'Incorrectly' is a matter of definition, and sorting 'Mc' the same as 'Mac' is incorrect. I know of no sorting algorithm that calls for equalization of different logical values. The other Mc/Mac names can be sorted differently from the other M names automatically because they are 'Mc' and 'Mac', which would be different from 'Morris' -- 'simply continuing' a practice because 'most people are familiar and comfortable with' is another way of saying 'we've always done it that way' which, of course, never makes it right.--Paul McDonald (talk) 18:07, 16 July 2010 (UTC)

When it comes to spelling, 'we've always done it that way' is exactly what does make it right! This is why it's as incorrect to spell your surname MacDonald, as it would be to spell your neighbour MacDonald as McDonald; in both cases, you have family tradition to support your particular variants. Some Mc/Mac surnames have a space after the Mc/Mac, and some protocols would regard these surnames as simply 'Mc' or 'Mac'; thus 'Mac Clellan' would sort before 'MacAlister'. Is that what you want? I also note your own username is Paulmcdonald, a name you yourself chose, yet you get nearly offended when someone misspells your name wrong the first time. What's the message there? -- Jack of Oz .. speak! .. 22:42, 16 July 2010 (UTC)

But the point that 'we've always done it that way' loses its force when you realize that WE don't do it this way anymore. Even the US Library of Congress stopped doing it this way 20+ years ago - look below in the other thread (search this page for the phrase 'a couple of decades ago' to jump to it) for the email response I got from the Library of Congress when I asked them about it. Such a system might have made sense in former centuries when different people spelled the same names differently (even from page to page), but given that we Scots have been literate for a while (end sarcasm tag) and we really know how we want our names spelled, I cannot see any justification for continuing this archaic practice.
William J. 'Bill' McCalpin (talk) 18:51, 19 July 2010 (UTC)

Just a note: it isn't only entries for today's literate people that we're talking about, it's entries for people from all different times whose names might have been spelled differently in different places. --Auntof6 (talk) 20:00, 19 July 2010 (UTC)

Comment I'm reasonably sure that no one from the 1700's is going to look up anything on Wikipedia.--Paul McDonald (talk) 20:35, 19 July 2010 (UTC)

Of course not, but someone today might look up someone who lived then whose name might have been spelled in different ways. Not a big deal, just a thought I had. --Auntof6 (talk) 22:28, 19 July 2010 (UTC)

Point of order: This thread had a gap from Feb 2008 to July 2010. The topic was discussed again in 2009/2010 at #Mc_vs._Mac below. I'm not sure what the protocol is for merging the various threads on the same topic, but thought I'd alert you to it! It doesn't seem helpful to start again on a thread which is not the most recent on the topic. PamD (talk) 13:49, 16 July 2010 (UTC)

And see Wikipedia_talk:Manual_of_Style/Archive_116#.22Mac_vs._Mc.22_Discussion_again for another archived discussion. PamD (talk) 13:52, 16 July 2010 (UTC)

- Probably should have been archived. Ideas?--Paul McDonald (talk) 18:07, 16 July 2010 (UTC)

Does anyone else find it interesting that articles must have reliable sources but Policies and Guidelines are promulgated based solely on the opinions of the editors who voice their opinions, often in the face of reliable sources? It does no good to cite reliable sources, such as the Library of Congress, in these instances. Consensus, defined as those who can defend an opinion longer than those on the other side, will always take precedence over rigor and reliable sources. JimCubb (talk) 21:35, 16 July 2010 (UTC)

Mc vs. Mac

This seems to be discussed above, but is 'For a surname which begins with Mc or Mac, the category sort key should always be typed as Mac with the remainder of the name in lowercase' necessary? I mean, I get McD should be Mcd, but why do we have to auto-change from Mc to Mac? Is there something I'm not getting? Wizardman 04:42, 1 March 2009 (UTC)

The reason is that we want McDonald/MacDonald or McAdams/MacAdams to be listed sorted as the same name.Headbomb {^ταλκ_{κοντριβς} – WP Physics} 04:52, 1 March 2009 (UTC)

Hm. to me they're still different, but to each his own, it's not something i particularly care about, just thought i'd ask. Wizardman 05:10, 1 March 2009 (UTC)

They are different names (Macdonald also exists), but if you only have the spoken version, this sorting lets you look in just one place. It doesn't matter where there are only a few Mc/Macs, but makes a difference to manual searching when there are lots. Please don't change the visible name, just the sort name (so no auto-change!). Finavon (talk) 14:00, 1 March 2009 (UTC)

This is something I've always hated, actually. There's lots and lots of names that have a most-common spelling and less common variants, e.g. Thompson, Thomson, but we don't 'normalize' spelling for any of them, and would properly be reviled if we did. Mc/Mac was often treated differently from other spelling variations in the past, but it seems to be a usage that's dropped out of most common references, Wikipedia being the only current exception I happen to know. The Encyclopedia Britannica used to consolidate Mac/Mc, but no longer does so. Studerby (talk) 22:03, 14 April 2009 (UTC)

I see your point about Thomson/Thompson etc. The Mc/Macs are in a slightly different category, though. Each well known Mc/Mac surname can have up to 4 variants: McDonald, MacDonald, Mcdonald, Macdonald (and possibly others such as M' Donald and those that have a space after the Mc/Mac). Britannica etc can deal with these effectively as they're written by a relatively small coterie of experts. WP is written by, potentially, everyone in the world. Native English-speakers have a hard enough time remembering which notable people are Macs, which are Mcs, which capitalise the first letter of the rest of the surname, and which don't - let alone people for whom English is a second or later language. It makes very great sense to me to look at a category and see all the Mc/MacDonalds etc grouped together. They still come out with the correct spellings. Brian McDonald appears before Charles MacDonald, who appears before Egbert Mcdonald, who appears before Simon Macdonald. If we sorted them under their exact spellings, the number of duplicate articles would rise significantly; the number of merges necessary to fix them would be too great; and it would be virtually unmanageable. I've identified various duplicate entries by simply resorting any Mcs or Macs I come across as 'Mac', and decapitalising the first letter of the rest of the name. It's simple, effective, and once done, it stays done. -- JackofOz (talk) 22:18, 14 April 2009 (UTC)
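The convention under discussion (type the prefix as Mac, lower-case the rest of the surname) can be sketched in a few lines of Python. This is only an illustration: `mac_sort_key` is a hypothetical helper, not something MediaWiki provides, and its prefix test is deliberately naive, so it would also rewrite unrelated surnames such as Mchedlishvili.

```python
def mac_sort_key(surname: str, given: str) -> str:
    """Hypothetical helper: build a 'Surname, Given' sort key with Mc folded into Mac."""
    # capitalize() upper-cases the first letter and lower-cases the rest,
    # so McDonald, MacDonald and Mcdonald all lose their internal capital.
    folded = surname.capitalize()
    # Naive prefix rule (an assumption): any name starting with "Mc" becomes "Mac...".
    if folded.startswith("Mc"):
        folded = "Mac" + folded[2:]
    return f"{folded}, {given}"


people = [
    ("Macdonald", "Simon"),
    ("McDonald", "Brian"),
    ("Mcdonald", "Egbert"),
    ("MacDonald", "Charles"),
]

# The displayed names keep their own spelling; only the ordering key changes.
ordered = sorted(people, key=lambda p: mac_sort_key(*p))
for surname, given in ordered:
    print(given, surname)
```

All four spellings collapse to the key 'Macdonald', so the given name decides the order: Brian, Charles, Egbert, Simon.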

We conventionally de-capitalize ALL non-initial letters (we have to, to get correct sorting for things like 'duBois'|'DuBois'|'Dubois'), leaving us with 'Mac'|'Mc'; I don't think it's too much to ask for people to look in 2 places in very large categories. The issue is irrelevant in most small categories, as they don't have enough entries under 'M' for the searched-for name to be missed. That said, consensus seems to be against me on this (although I don't think a formal effort to determine the consensus has been done), and I have no problem with following consensus or feel like agitating for a review - it came up, I commented.. A lot of lesser issues in sorting are matters of choice, and this is one. I don't think the current way is wrong, per se, I just really really don't like it and disagree with the common rationale for it and wish we were more like other major references in this regard. Part of the issue is really that Wikipedia needs to improve its finding aids; categories are a weak but necessary aid for finding people when you know a little about them but not exactly how the name is spelled. That is the only scenario (I can think of) in which the effort to sort all Mc/Macs together makes particular sense, and it does. Better finding aids would let us dispense with what I see as an oddity.. but we're not there yet. Studerby (talk) 19:47, 15 April 2009 (UTC)

Because users have trouble distinguishing Mc/Mac is no reason for Wikipedia to start altering the way these names are sorted. Thompson/Thomson, Chris/Kris, Tom/Thom, and Evenson/Evanson are other examples where we would not change the sort order. Dictionaries do not purposefully mis-order the word 'sophomore' simply because English speakers drop the middle syllable ('o'). This is senseless. It does not matter whether a non-English speaker can remember the difference between Mc/Mac or McDonald/Mcdonald/MacDonald/Macdonald. The information needs to be correct. Most English speakers do not pronounce the 'g' in 'Nguyen'; does this mean all Nguyens are now to be sorted as 'Nuyen'? No, but we'll mis-sort Mc/Mac names. Senseless. - Tim1965 (talk) 00:22, 17 May 2009 (UTC)

Well, it depends what you mean by 'correct'. Firstly, all the individual names appear correctly spelled, wherever they appear in a list. Secondly, it's very common in the real world for long lists of names sorted by surname to start the M section with a sub-section for the Mcs and Macs, and the remaining M surnames follow. That's a useful system out there, and it's just as useful here. It's a convention to do it that way; it's neither more nor less 'correct' than having a list sorted strictly alphabetically. -- JackofOz (talk) 03:46, 17 May 2009 (UTC)

I too question the rationale behind this seemingly silly convention. What of Smith/Smythe, Nguyen/Winn, or any other hard to spell last names? The fact that one may not know which spelling is correct for a given individual is adequately solved by disambiguation. I don't see any reason that this would lead to duplicate articles, and I believe it's more correct to alphabetize people correctly than to worry about finding entries in large categories. I don't see any compelling reason that a non-paper encyclopedia would categorize names phonetically, and it's even more confusing when it's only done for one class of surname. Is there really consensus for this? Oren0 (talk) 03:41, 2 June 2009 (UTC)

I had trouble believing that this merging of Mc and Mac is a good idea (perhaps because my last name is 'McCalpin', I am entitled to have an opinion ;-) ). I noticed in the discussion at the top of the page that one part of the US Library of Congress uses the 'split' method (i.e., not treating Mc and Mac as the same spelling), so I asked their reference section what the policy was for the entire Library. This was the answer I got:

Question History:

Patron: General Inquiry:

Wikipedia sorts last names that begin with Mc- 'merged' with last names that begin with Mac-, as if the Mc- names were actually written as Mac-. This leads to a different sort sequence in their lists than any computer would generate and different than most people would expect.

Their argument is that this is the way it's been done for a long time, because of the variations and unpredictability in the spellings of Scottish names. However, others point out that fewer and fewer institutions are building their sorted lists this way, choosing to go for the more straightforward 'sort on the actual spelling'.

Which does the Library of Congress do and why? Do you know if there have been international standards on this subject by those in library sciences?

Thanks!

Bill McCalpin

(here follows verbatim reply from the Library of Congress)

Librarian 1: Thank you for consulting the Library of Congress's Digital Reference Section.

It is my understanding that the Library of Congress rules for filing catalog cards changed a couple of decades ago [emphasis mine] from filing all the 'M'', 'Mc' and 'Mac' together ('as they sound' it was explained to me) and the Library now files them exactly as they are spelled.

The official filing rules may be found in the title 'Library of Congress Filing Rules' described here: < http://www.loc.gov/cds/catman.html#locfm > and available in print via the Cataloging Distribution Service at the Library of Congress for $10. The title is also available as part of the 'Cataloger's Desktop,' an online subscription resource for catalogers.

The text confirms that the current practice at the Library is to alphabetize exactly as words are spelled.

Section 1., 'Basic Filing Order' says the following:

(begin quote)

Fields in a filing entry are arranged word by word, and words are arranged character by character. This procedure is continued until one of the following occurs:

a. A prescribed filing position is reached.

b. The field comes to an end (in which case placement is determined by another field of the entry or by applying one of the rules given hereafter).

c. A mark of punctuation indicating a subarrangement is encountered.

1.1. Order of Letters

Letters are arranged according to the order of the English alphabet (A-Z). Upper and lower case letters have equal filing value.

(end quote)

After I have sent you my response, I plan to refer your inquiry to my colleagues in the Library's Acquisitions and Bibliographic Access Directorate (the most authoritative source of this information at LC), so they can confirm this information, and provide you with any additional information they may have on any incipient standards of which they are aware.

Another useful book that includes rules on alphabetizing, if you wish to explore this further is:

LC Control No.: 2005004214

Personal Name: Mulvany, Nancy C.

Main Title: Indexing books / Nancy C. Mulvany.

Edition Information: 2nd ed.

Published/Created: Chicago : University of Chicago Press, 2005.

Description: xiv, 315 p. ; 24 cm.

ISBN: 0226552764 (alk. paper)

You may want to check for this book at your local library.

We hope you find this information helpful. Good luck with your research!

Digital Reference Section

Ask A Librarian Service

The Library of Congress/lsg

So, could anyone give me a reason that actually makes sense to people who carry this type of last name why Wikipedia wants to sort differently than Library Science professionals, other than 'well, that's how people used to do it'? Also, who makes this decision and how is it changed?

William J. 'Bill' McCalpin (talk) 17:36, 5 February 2010 (UTC)
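For comparison, the 'Basic Filing Order' passage quoted in the reply above (word by word, character by character, upper and lower case equal) can be approximated as follows. This is a rough sketch under stated assumptions: `loc_style_key` is a hypothetical name, and the code ignores punctuation, subarrangement and every other rule in the Library of Congress text.

```python
def loc_style_key(name: str) -> tuple[str, ...]:
    """Hypothetical key: compare word by word, then character by character, ignoring case."""
    # Splitting on whitespace gives the "word by word" comparison;
    # lower-casing gives upper and lower case equal filing value.
    return tuple(word.lower() for word in name.split())


names = ["Morris", "McAdams", "MacDonald", "Mac Clellan", "MacAlister", "Mabry"]

# Names are filed exactly as spelled: Mac... and Mc... are no longer interleaved.
print(sorted(names, key=loc_style_key))
```

With this key 'Mac Clellan' files ahead of 'MacAlister', because the single word 'Mac' is compared first, and every Mac- name files ahead of every Mc- name.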

Mc.. vs Mac..

Why do we treat these two names as the same when the legal names have diverged and are no longer synonyms? As of now we force the sort key to always be 'Mac', but why? --Richard Arthur Norton (1958- ) (talk) 02:35, 15 April 2009 (UTC)

See 'Mc vs. Mac' above. -- JackofOz (talk) 06:36, 16 April 2009 (UTC)

Retrieved from 'https://en.wikipedia.org/w/index.php?title=Wikipedia_talk:Categorization_of_people/Archive_5&oldid=658925081'

LibraryThing

Hello, LibraryThing. I'm not sure if there is a better place to post/raise this issue, but I've noticed a couple errors on LibraryThing that relate to proper alphabetizing/filing rules:
1) According to the American Library Association's 'letter-by-letter' rule:
http://www.ala.org/tools/libfactsheets/alalibraryfactsheet27
authors should be filed/alphabetized letter by letter. Adhering to this rule would look like this:
Oates, Joyce Carol
O'Connor, Flannery
That is to say, names with prefixes (such as 'O'Connor') should be treated as one word and alphabetized letter by letter. So, Flannery O'Connor would be alphabetized after Joyce Carol Oates because 'Oa' comes before 'Oc.' Anyway, I noticed this rule is not currently being followed on LibraryThing, as Flannery O'Connor is presently alphabetized before Joyce Carol Oates. Any chance of fixing this?
2) Carlos Ruiz Zafón, author of The Shadow of the Wind (read this book if you haven't, it's phenomenal!), is currently alphabetized on LibraryThing under 'Z' for Zafón, which is incorrect. Just like Gabriel García Márquez, who is correctly alphabetized under 'G' for García Márquez, Ruiz Zafón should be properly alphabetized under 'R' for Ruiz Zafón. This is confirmed on the copyright page of his works:
Ruiz Zafón, Carlos
Anyway, if there is a better way to alert the folks at LibraryThing of these issues, please let me know. I just thought they were worth bringing up in order to improve the accuracy of the site.
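The difference between the two orderings in point 1 can be reproduced with a short Python sketch. It assumes the letter-by-letter reading described in the post (punctuation and spaces ignored, surname compared before given names); `letter_by_letter_key` is a hypothetical name and the code is not taken from any ALA or LibraryThing source.

```python
def letter_by_letter_key(entry: str) -> tuple[str, ...]:
    """Hypothetical key: keep letters only, compare the surname part before the given names."""
    parts = entry.split(",")
    return tuple(
        "".join(ch for ch in part if ch.isalpha()).lower()
        for part in parts
    )


authors = ["O'Connor, Flannery", "Oates, Joyce Carol"]

# Plain string comparison: the apostrophe sorts before any letter.
print(sorted(authors))
# Letter by letter: "oates" is compared with "oconnor".
print(sorted(authors, key=letter_by_letter_key))
```

The plain sort puts O'Connor first, which matches what the post reports seeing on the site; the letter-by-letter key puts Oates first.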

Why should LT follow the ALA alphabetizing rules? It isn't actually a library.

>1 erbisoeul: Where on LT are things alphabetized wrong? In your catalog?

You'll find that most libraries don't follow the ALA filing rules anymore, because computers don't file like that and most of us have computer-based catalogues. The ALA rules are for CARD catalogues. I stopped worrying about those the day I emptied the card catalogue drawers over the recycle bin (o frabjous day!)
Regarding Carlos Ruiz Zafón - how his name is filed depends on how it was entered. If the users entering those books entered it as Zafón, Carlos Ruiz, then that's how it will show. If you look at the Common Knowledge for his name, the canonical (official) version of his name has already been entered as Ruiz Zafón, and that's all that can be done to correct the error.

>2 jjwilson61: Because LibraryThing is a social cataloging system, which is used by, among others, libraries. To be an effective tool, it would make sense for the site to adhere to standard rules of cataloging.

>4 tardis: If the users entering those books entered it as Zafón, Carlos Ruiz, then that's how it will show.
Except that users probably didn't enter it directly but got it as part of a book record from some source. If you used Amazon as the source and it gave you bad data then you should just fix it in your catalog.

>5 erbisoeul: But most users are individuals who understand how computers alphabetize things. To use the ALA rules would probably cause more confusion than it helps.

>6 jjwilson61: I forget about the bad Amazon data thing because I don't use Amazon data myself - if I can't find it in Overcat or a library, I use manual entry. But the fact remains, the user who imported the data accepted it as-is rather than editing it to correct it. Most probably through ignorance - if you're used to standard names in most of the English-speaking world, then Ruiz looks like a middle name.

Alphabetising by computer rules is straightforward and easy to understand. Non-librarians have always had trouble with ALA filing rules. Besides, this is not just a site for US members, and why should those members have to learn that way of doing things?

>9 MarthaJeanne: Plus, I expect that the programming challenges of making a computer use ALA filing rules are not worth the time it would take. And any time a member exported their data to Excel and sorted it, they'd lose the ALA sort again anyway.
At least LT drops 'The,' 'A,' and 'An' when it sorts.

Except on the combining pages now.
Totally agreed on the programming challenges. The nonstandard characters are still a hassle.

>11 MarthaJeanne: Really? Guess I haven't done any combining for a week or two. Hope they get that fixed soon!

>12 tardis: Nope. It was changed on purpose because the old code wasn't working and it was too much trouble to fix.

>13 jjwilson61: Dang!!

Tim asked and no one objected. The thread was in the Combiners group.

>4 tardis: If you follow the link I provided above, you will see that the ALA Filing rules are 'rules for the arrangement of bibliographic records whether displayed in card, book, or online format. These rules are 'letter-by-letter' (or 'character by character') rules. They also largely ignore distinctions among different punctuation marks and do not distinguish among the types of access points.' So, stating 'The ALA rules are for CARD catalogues' is erroneous, as you are overlooking their purpose as a whole. As far as your claim that 'most libraries don't follow the ALA filing rules anymore,' well, as a librarian myself, I admit you raised an eyebrow with that one.

>7 jjwilson61: Fair enough, but 'How computers alphabetize things' is technically not alphabetizing, which is what I have pointed out above. An alphabet is comprised of letters, not punctuation, which is why the ALA rules rightly ignore punctuation when alphabetizing. I admit I'm no expert on computer programming, but I find it hard to believe a little detail such as this couldn't be fixed by someone with the proper know-how. Also, properly alphabetizing words is hardly confusing.

>9 MarthaJeanne: I wasn't implying that LibraryThing was a site just for US members, although I am curious how other English-speaking countries alphabetize. Other than spelling variations, I can't see how it would be any different. I've looked around online and haven't been able to find an alternative set of rules. If you know of any, I would greatly appreciate a link.

>17 erbisoeul: I spent part of my career designing filing systems. There are without a doubt different ways of alphabetizing things, some of which are not compatible with the way computers 'alphabetize,' and some of which appear to be different from the ALA. There isn't a right or wrong way so much as there are conventions that need to be agreed upon in particular systems. If the clerk to the left files McGxxx before MacGxxx but the clerk on the right does the opposite (both ways being technically correct depending on the system) retrieval is going to be a beast. That's why a lot of filing systems use numbers, which can be sorted without ambiguity.
After all, there would be no need to have standards or guidelines if everyone did it the same way to begin with.

>18 erbisoeul: Why do you want to limit it to English-speaking?

If you're using a card catalogue or looking along a shelf, you need to know whether Oates comes before O'Connor, otherwise you waste time or risk not finding what you're looking for. If you're using a computer, you don't care, because you're never going to look for a known author name by browsing a list: you use the search tool, and all you care about is that the search tool doesn't freak out when you type in an apostrophe. Arguing about which set of rules we should use for alphabetical sorting is a bit like discussing how far apart the stables should be on the freeway. It's an interesting discussion, and it is something it would be nice to get right, but it isn't actually very important.

>20 MarthaJeanne: What was it that erbisoeul said that makes you think that?

Besides, I just looked at your link, and there are 2 different sets of rules, that give different orders! Reading it carefully, it says that you need to set filing rules, and that they can differ between libraries.

>22 Collectorator: See >18 erbisoeul:, where erbisoeul mentioned wondering about exactly that.

>24 jjmcgaffey:, yes that is the only mention of it I find. Must be a red herring.

>21 thorold: that's not always the case. If I'm unsure of the complete spelling but have some indication of at least the start, then I'll peruse a list looking for the right item.
But I'm not sure that changes your point significantly.

Erbisoeul, your suggestion that LT should adopt an English-language character set for alphabetizing in order to 'fix' things does not take into account the international orientation of this site. Many users here have catalogued items from non-English languages, some of which are diacritic-heavy, and some of which contain (many) more characters than the 26 of English. Rules for alphabetizing are different in different languages, too, so alphabetizing by 'letter' is actually rather difficult to implement. (I'm ignoring computer-based alphabetizing for now).
I'll give you an example. Just for my catalogue, author names (and titles and tags etc) have French characters (é, è, ê, à, etc), Dutch characters (ë, ï), German characters (ü, ä etc), Swedish characters (å, ä, ö), Danish characters (æ, ø, å), Icelandic characters (ð and þ, etc) and Irish characters (ó, á, etc). The link you provided didn't give any information on handling diacritics (which is unacceptable) and I think it's highly likely that the American system you suggested would collapse ô, ö, ó and ò with o, and à, á, ä and å with a -- what else could they mean by 'letter-by-letter'? For some languages that is fine; but for others it most definitely is not.
LT already has its issues with alphabetizing non-English characters (I'm ignoring non-latin alphabets for the moment, and non-alphabetic writing systems): it ignores the distinctions between ó and ö, for instance, and the alphabetizing is wonky because of that. Again using my catalogue as an example, the authors Ó Donnchada and Östergren should not be listed between A and B.
I do not think that adopting an English-language set of rules would 'fix' things; it would positively make matters much worse for many users. Remember that LT is not an American-only website -- or even an English-language only website, and the system must be able to accommodate much more than the ALA's letter-by-letter rules allow for. Restricting ALA-type rules to just the .com version of LT would not be a solution: many English-speaking users read multiple languages; and many international LTers use the .com version of the site, even when a version in their language is available.
Finally, when you say in >18 erbisoeul: 'I wasn't implying that LibraryThing was a site just for US members' -- maybe not, but when you suggest that an American set of alphabetizing rules should be followed and that they would 'fix' things, you are not only assuming (and therefore implying) that rules for alphabetizing English should be the norm, but even that they would be useful or even acceptable to all users of this site. And that is just not the case.

>27 Petroglyph: Sometimes I wish for a Like button here on LT. Hear hear!

The solution for non-English characters is the Unicode default sort, which should be easy for LT programmers to install. This would sort accents as secondary weights, only making a difference if the words are otherwise the same.
It's possible they can configure a custom sort, but anything smarter would be a lot more confusing for anyone not a serious polyglot; ñ is its own character, but ç won't be. å might be its own character at the end of the alphabet, but ä can't be, because its biggest user is German, which doesn't sort it that way. (Or we don't put å at the end, because that would be more consistent?) It would be impossible for anyone without several languages under their belt to figure out how to sort a title, and a lot of languages would still not sort right.
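The idea of accents acting as secondary weights can be approximated with Python's standard `unicodedata` module. To be clear about the assumptions: this is not the Unicode Collation Algorithm itself, only a two-level stand-in; `strip_accents` and `two_level_key` are hypothetical names; and letters that have no canonical decomposition (ø, æ, ð, þ) are left untouched.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks after canonical decomposition (Ö becomes O, ó becomes o)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def two_level_key(text: str) -> tuple[str, str]:
    """Primary level: unaccented, case-folded letters. Secondary level: the original string."""
    return (strip_accents(text).casefold(), text)


names = ["Zola", "Östergren", "Olsen", "Ó Donnchada", "Brown"]

# Plain code-point order pushes the accented capitals past "Zola".
print(sorted(names))
# With the two-level key they file among the other O names.
print(sorted(names, key=two_level_key))
```

The accent only matters when two entries are otherwise identical, for example 'cote' and 'côte', where the second element of the key breaks the tie.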

I'm with 27>

I'm not entirely sure, but I don't really see why it would be necessary to adopt any kind of library-approved rule at all? I'm not talking about diacritics, umlauts and the like (and oh, how I would love a solution for sorting in my library!), but the given example, Ruiz Zafon - if in your library, it's sorted with Z, you'll have to change your own author entry from Zafon, Carlos Ruiz to Ruiz Zafon, Carlos. I really can't see where else we would need authors being listed alphabetically? The LT entry lists him as Ruiz Zafon, so the 'error' is in your own library.
As for titles, I'm not paying too much attention to alphabetical listings on combine pages anyways. There are too many and way too many junk entries to have them all listed in a 'scientific' way. This is a site living by user entries, after all.
But I do want to add my voice in suggesting that we need a way of dealing with diacritics! Or maybe there could be an option of moving authors/titles inside one's own library. I have way too many Kästner's and Krüss's showing up all over my library to not care about that issue!

Of course the ultimate solution is to tag all your books with a sorting key and then sort by tags. (I'm not entirely unserious about this. My shelves are once in a while sorted by author and title, and I'm not really fond of the idea of disregarding a, an and the when sorting, so I'd prefer a more 'mechanical' sorting than most.)
If you want to sort on several fields this gets very messy very quickly, so it is not a feasible solution.
Another more technical solution is to export your data to a spreadsheet and then devise your own sorting. I do something like that only using a database rather than a spreadsheet.
Most of the time I just search to find a specific book and then I don't care as much about the sort order as long as it is approximately what I'd expect.
ETA: I'm adding some of Sue Grafton's books and of course 'A for Alibi' sorts as 'for Alibi' which looks silly.

>16 erbisoeul: Well, I just checked, and our online catalogue does file by ALA rules, so your raised eyebrow is deserved :) In my defense (feeble though it is) the ILS software handles all of that so I don't have to think about it, and in my library we don't file by author (LC# plus cutter for main entry). I'm not sure if users often browse the catalogue by author, so I doubt they'd notice one way or the other.

>32 bnielsen: You do know about the double pipe for adjusting the sort? If you enter the title as ||A for Alibi, it will sort as 'A for Alibi' instead of 'for Alibi, A'. Similarly, if you have an article that LT doesn't recognize at the beginning of the title (or whatever), put the double pipe in front of what you want the title to sort by and it will. Doesn't help with diacritics etc, but if your problem is that it's sorting by the wrong word the double pipes are a lifesaver.
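How a title sort with dropped articles and a double-pipe override might behave can be sketched as below. This is a guess at the behaviour described in the thread, not LibraryThing's actual code: `title_sort_key`, `display_title` and the article list are all assumptions made for illustration.

```python
LEADING_ARTICLES = ("the ", "a ", "an ")  # assumed list, taken from the thread's 'The', 'A', 'An'


def title_sort_key(title: str) -> str:
    """Hypothetical sort key: honour a '||' marker, otherwise drop a leading article."""
    if "||" in title:
        # Everything after the double pipe is what the title sorts by.
        return title.split("||", 1)[1].casefold()
    lowered = title.casefold()
    for article in LEADING_ARTICLES:
        if lowered.startswith(article):
            return lowered[len(article):]
    return lowered


def display_title(title: str) -> str:
    """The marker is only an editing aid, so it is removed for display."""
    return title.replace("||", "", 1)


print(title_sort_key("A for Alibi"))    # article dropped
print(title_sort_key("||A for Alibi"))  # article kept because of the marker
print(display_title("||A for Alibi"))
```

The point of the marker is that the stored title carries its own sorting hint, which is also why one poster objects to it as putting two things into one field.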

>34 jjmcgaffey: Thanks. I do know about the double pipe, but I'll live with funny sorting rather than pollute the title with pipes. BTW some of the library records also contain stuff like that, a is supposed to mean aa sorted like å, I think.
To me that's stuffing two things into one field and I try to avoid doing that.
But yes, it's a way of fixing 'A for Alibi' or 'A is for Alibi' if you have the English version. Actually LT could have implemented sorting so it would notice that 'A for Alibi' is a Danish edition, so eliminating A in the sort should not apply. But no doubt this would make LT more complicated and slower. I prefer simple rules, so I'll just live with the quirks of the current regime :-)

I have found the above discussion very interesting and have learnt a few new things, e.g. the ALA 'letter by letter' rule.
Just wondering where 'Üni-Code' fits into this..?

>36 guido47: New readers can start here (Hint: You'll need lots of coffee or stronger for this).
http://www.unicode.org/reports/tr10/

I looked just a bit, and the result is (as we have known for a long time) there just isn't a way to sort that will please everyone. For example the character ö can be sorted, depending on language and context:
as o (usual in English and other languages that don't use it)
as oe (German for libraries, etc.)
after oz, i.e. next letter after o (German for phone books, etc.)
as one of several letters that come after z (Swedish)
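To make the four conventions concrete, here is a small sketch with one sort key per convention. The helper names and the word list are made up for illustration, and these are toy keys, not real locale collation (a real system would use something like the Unicode Collation Algorithm linked above).

```python
import unicodedata

LATIN = "abcdefghijklmnopqrstuvwxyz"


def expand_key(replacement):
    """Key that rewrites ö as plain letters before comparing."""
    def key(word):
        word = unicodedata.normalize("NFC", word).lower()
        return word.replace("ö", replacement)
    return key


def alphabet_key(alphabet):
    """Key that ranks each character by its position in a custom alphabet."""
    def key(word):
        word = unicodedata.normalize("NFC", word).lower()
        return [alphabet.find(ch) for ch in word]
    return key


as_o = expand_key("o")                                  # English habit
as_oe = expand_key("oe")                                # German library style
after_o = alphabet_key("abcdefghijklmnoöpqrstuvwxyz")   # separate letter right after o
after_z = alphabet_key(LATIN + "åäö")                   # Swedish: ö at the very end

words = ["Oz", "Öl", "Ofen", "Zoo", "Pan"]
for name, key in [("as o", as_o), ("as oe", as_oe),
                  ("after o", after_o), ("after z", after_z)]:
    print(name, sorted(words, key=key))
```

The same five words come out in four different orders, which is the whole problem in miniature.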

Thanks >37 bnielsen:,
As a one-time programmer, when I read 'Collation order is not a stable sort. Stability is a property of a sort algorithm, not of a collation sequence.' I realized I would indeed need a stiff drink. Not just coffee.

>38 MarthaJeanne: 'as oe (German for libraries, etc.)'
Interesting. I didn't know that. Turkish seems to sort like German.
'as one of several letters that come after z (Swedish)'
Yes, from A-Z is from A-Ö in Swedish.

>35 bnielsen: The double pipe only shows up when the title is edited, not when it's displayed, so it shouldn't 'pollute' the title.

Maybe I did it wrong, but my double pipes showed up the day after I put them in, and they still show. Not in my catalogue, but on the main book page.
http://www.librarything.com/work/605934/book/29067462

They aren't supposed to. You should report it as a bug.

I think the double pipes are also stoved into the tab-export and I don't like stove pipes there :-)
>39 guido47: At least I hope I provided ample warning so you had the ingredients at hand.

>44 bnielsen:
Speaking of the tab export, did your bug ever get fixed?

>45 lorax: The bug is with the Marc export, and the answer is no, but it has been very well defined. Some 180 of my books are just missing in Marc export. All of them have been added with 'Det kongelige Bibliotek' as source so that's a smoking gun.
The protocol for fetching the Marc export is a bit weird, but I've finally gotten around to scripting it.
..
Columns: 33/(66242), Rows: 2/6201, rdbtable ok: lthing.rdb
- - checking 6201 books -- 922 tagged as recycled - - 6019 in marc format - -
- - checking 6106 books with cover artist descriptions - -
So it's 182 books missing as of this moment.

Rousing discussion everyone. I'm glad I could bring ALA rules to the attention of anyone who was unaware of them, and thanks to those who contributed to this discussion in a meaningful way. I agree that there is no method of cataloging that will please everyone, but that doesn't mean the issue isn't worthy of discussion. Just to be clear, I am an American librarian and I, of course, realize that these rules do not apply to all languages and was not suggesting otherwise.
>27 Petroglyph: 'your suggestion that LT should adopt an English-language character set for alphabetizing in order to 'fix' things does not take into account the international orientation of this site.'
Most users, including myself, are aware that LibraryThing is comprised of multiple language sites, such as ara.librarything.com (Arabic) and jp.librarything.com (Japan), etc. When I suggested the implementation of ALA rules here on LibraryThing, I assumed individuals would logically deduce that I was referring to the main English site at www.librarything.com. Apologies for not spelling this out in my initial post, but I figured it was self-evident.
'Rules for alphabetizing are different in different languages, too, so alphabetizing by 'letter' is actually rather difficult to implement.'
Transliteration, i.e. not difficult at all.
'The link you provided didn't give any information on handling diacritics (which is unacceptable)'
Diacritics are ignored, same as punctuation. They are also not exclusive to a particular nation as you suggest.
'Restricting ALA-type rules to just the .com version of LT would not be a solution: many English-speaking users read multiple languages; and many international LTers use the .com version of the site, even when a version in their language is available.'
When in Rome, do as the Romans do.
'..when you suggest that an American set of alphabetizing rules should be followed and that they would 'fix' things, you are not only assuming (and therefore implying) that rules for alphabetizing English should be the norm,'
Actually, it's a suggestion, not an assumption. I suggest ALA rules because they properly alphabetize the Latin alphabet, i.e. letter-by-letter, and ignore characters that are not part of the alphabet. As I have pointed out in my initial post, the implementation of punctuation, such as the apostrophe in 'O'Connor,' leads to errors in proper alphabetization. Furthermore, I did ask above if there were other ways of alphabetizing in English and, so far, no one has pointed to a concrete alternative (see 18).
'..but even that they would be useful or even acceptable to all users of this site. And that is just not the case.'
No kidding and, again, I never suggested this.
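For reference, the letter-by-letter approach described in this post (ignore diacritics, punctuation and spaces, compare only the letters) can be sketched in a few lines. This is only an illustration of the idea as stated here, not the official ALA rules and not anything LT does; the function name is made up.

```python
import unicodedata


def letter_by_letter_key(name: str) -> str:
    """Hypothetical key: strip diacritics, punctuation and spaces, then lowercase."""
    # NFD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", name)
    # isalnum() is False for combining marks, punctuation and spaces,
    # so only the base letters and digits survive.
    return "".join(ch.lower() for ch in decomposed if ch.isalnum())


names = ["Odell", "O'Connor", "Oates"]
print(sorted(names, key=letter_by_letter_key))
print(letter_by_letter_key("Bērziņš"))
```

With this key O'Connor files between Oates and Odell instead of ahead of every other O name.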

The different language sites only show different translations of the text on the site pages. It doesn't affect the actual way the site works, like sorting, at all.

>48 jjwilson61: I noticed this as well, at least in the French, Pirate, and Japanese versions I took a look at. However, surely there must be differences in sorting, at least in so far as native users are concerned. My own library remains unchanged when I switch to the Japanese site, as it was entered using the Latin alphabet and transliteration evidently does not occur. However, if someone from Japan is using the site and entering data in their own language (assuming this is possible), I wonder wouldn't there be differences in how that language is cataloged?

No - if you entered a couple Japanese books (in Japanese), they would be sorted in your library exactly the way they'd be sorted in a library that was all in Japanese. The same sorting rules apply across all the sites. Which means that the sorting rules on LT are extremely complicated and complex..which is what we've been trying to say.

>47 erbisoeul: 'Transliteration, i.e. not difficult at all.'
That's horrible. I've never seen an alphabetized index or anything that sorted things by their transliterated forms, and I never hope to see one. Even in something like Russian, it could be quite frustrating to look for a Russian book, especially if you don't know exactly what transliteration is being used. Georgian? Converting თ to t and ტ to t', and then sorting them by ignoring apostrophes, is going to be miserable.
The only reasonable way, IMO, to sort material in multiple scripts is to put the scripts in some order and sort them internally according to their own rules.
'Diacritics are ignored, same as punctuation.'
What about eth and thorn? What about the other hundred or so non-diacritic characters in the Latin script?
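A quick sketch of why 'just ignore the diacritics' runs out of road: stripping combining marks after Unicode decomposition handles ü and ó, but letters such as þ, ð, ø and ł have no decomposition, so they pass through untouched. The function name is made up for illustration.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_marks("Krüss"))                # the umlaut is a combining mark, so it goes
print(strip_marks("Þórbergur Þórðarson"))  # ó loses its accent, Þ and ð stay as they are
print(strip_marks("Łódź"))                 # ó and ź are reduced, Ł is not
```

Anything that survives this step still needs its own rule about where it files.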

-snip-

barf bowl

We want dd, ff and ll digraphs for Welsh, or we'll blow up yr mailboxes!

I knew it was only a matter of time before Meibion Glyndŵr found their way here.

And this after the Dutch convinced Tim that their digraphs should always be read as two letters no matter how they had been coded.

>56 MarthaJeanne:: Yes, because that's how our digraph ij should be read. You will find it under 'i+j' in any Dutch lexicon, even though it's sometimes - mistakenly - used as 'y'.

>57 Nicole_VanK: But in encyclopaedias and indexes it is alphabetized as 'y'. Yes, we are not very consistent.

>58 henkl: I myself would have no idea how to sort on all our weird letters. I admit I would sort IJ at Y, and OE at O, EU at E etc. No idea of the correct sort order.

I am a literal freak, I love reading this thread. :D
A long time ago, in a public school system in the US, we were taught alphabetizing (if that isn't a word, it should be, ha!) that apparently differed from what is being discussed here. Names such as O'Connor and O'Reilly were at the beginning of the O section, followed by O entries without an apostrophe. The same system was used for Mc and Mac prefixes. I thought it was used in our library, I'm fairly certain of it. This system was also used in phone books for many years, and might still if there are any phone books still being published and distributed.
Anyone else recall this system? I can't be the only one..

Yes, I've seen that system used in indices.

>60 fuzzi: Yep

Somewhere I was taught that 'Mc' names should be alphabetized as if they were spelled 'Mac'. Not sure what system that is..

>60 fuzzi: That's what I remember, too. And not just from school. In the early 90s, I was a file clerk and that's how I had to alphabetize.

The Mc/Mac rule was in use in libraries back when we were typing all the cards, and then alphabetizing all of them by hand. I think the decision was made somewhat arbitrarily, but in library classes we were taught to follow it. Then, when computers made their appearance, we found that they alphabetized letter by letter automatically and we had to change the way we did things.
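Here is a small sketch of the difference between the two approaches: plain letter-by-letter, which is what a computer does by default, and the old card-catalog habit of filing Mc (and M') as if spelled Mac. The function names and the sample surnames are made up for illustration.

```python
import re


def plain_key(surname: str) -> str:
    """Letter-by-letter: keep only the letters, lowercase."""
    return "".join(ch.lower() for ch in surname if ch.isalpha())


def mac_key(surname: str) -> str:
    """Old filing rule: treat a leading Mc or M' as if it were spelled Mac."""
    normalized = re.sub(r"^(Mc|M')", "Mac", surname)
    return plain_key(normalized)


names = ["Mabry", "McArthur", "MacDonald", "Madison", "McBride"]
print(sorted(names, key=plain_key))  # Mc names land after Madison
print(sorted(names, key=mac_key))    # Mc names interfile with the Mac names
```

Neither order is wrong; they are just two different consistent rules.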

All alphabetizing systems are arbitrary. All that matters is that they are internally consistent.

Ugh, when I started as a page in the library (wayyyyy back when) I got given the 'page' test. Where of course you have to put the books in the correct order. So I do, putting both Mc and Mac where they should go in a regular alphabet. Then I get told (to be fair my to-be boss was smiling at the time) that I only got one wrong on the test.
For some odd reason, at that time in that library we were filing both the Mc and the Mac authors... before even the Maa, or Mab authors.. *doh* :)

>67 DanieXJ: yes, that's the way it was done, apparently 'bc' (before computers).

>67 DanieXJ: That's how it was done when I was a file manager & records management specialist in the '70s & '80s.

>67 DanieXJ: Your post made me smile, because my son became a library page a few months ago, and we were teasing him because he had to take the 'page test'.
When I was a child, it always bothered me that Mc and Mac were shelved together.

I worked as a temporary secretary for a few years, and the question of 'which filing system' always needed to be asked early. O' first, or treated as O(space) (which mostly put it at the beginning, but not always), or treated as O(next letter)? Mc and Mac together, and if so with Mac or at the beginning or at the end? Did 'the' count in a name/title, or did it get ignored? I swear every office I went to (dozens!) had a slightly different arrangement of files.
I never worked in a library, but I've looked for books in plenty of different ones, and had the same 'where do _these_ people put them?' question.

I think the problem with many of the IMHO over-fancy schemes is that they only make sense to a select few. The Danish authorized filing rule used in dictionaries is that 'aa' sorts as 'å' (i.e. the end of the alphabet) if it is pronounced as one vowel, but as 'aa' if pronounced as two vowels.
The examples where this makes an actual difference are few and far between, but I've looked for the word 'kraal' (South African origin, I think) and couldn't find it in the dictionary because it was filed at the bottom of the kr* words.
Hmm, maybe I should write a book and change my name to Kraal just to see where the librarians would put it. :-)
But it's possible they have another rule for author names.
My vote goes to simple rules. If a publisher calls an author McArthur on one book, and MacArthur on the next, I can live with looking for his books on two different shelves if need be. Even Evtjuchenko, Jevtjujenko, Yevtjuchenko would be ok.

Can you put the titles of the books in alphabetical order? I did the authors but would like to see the titles that way. I'm new here and just starting a small library. I thought it would make it easier for patrons that view my library online to find books if they were in alphabetical order. Any suggestions???
Thanks

>73 Beckysbooknook: Just click on the word 'Title' at the top of that column. You can do that with most columns, and your visitors should be able to do the same thing.
If you want to sort by two things, say, by author and then by title, look for the icon with the one-up, one-down arrows, near the printer icon, above the catalog columns. You can sort in all different combinations.

Note that your patrons will have to sort for themselves. You can suggest a style, but that does not include the sort.
BTW when you change the sort by clicking on a column, the previous sort becomes the subsort, so you can also sort by author and then by title by clicking first on title and then on author.
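That 'previous sort becomes the subsort' behaviour is what you get from any stable sort, which is also the point of the 'stability is a property of a sort algorithm' remark quoted earlier in the thread. A minimal sketch follows; whether LT actually does it this way is an assumption, and the book list is made up.

```python
books = [
    ("Grafton", "B is for Burglar"),
    ("Christie", "Nemesis"),
    ("Grafton", "A is for Alibi"),
    ("Christie", "Curtain"),
]

# First click: sort by title.
by_title = sorted(books, key=lambda book: book[1])

# Second click: sort by author. Python's sorted() is stable, so books by the
# same author keep the title order they already had.
by_author_then_title = sorted(by_title, key=lambda book: book[0])

for author, title in by_author_then_title:
    print(author, "/", title)
```

The result is the same as sorting once on the pair (author, title).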

At the bottom of a list or cover view is a Permalink. If you copied this, would it include the sorting preferences?
James

>76 Keeline: Yes, and it even seems to work now. It's been rather wonky.. It doesn't do subsorts (or rather, it has two sorts but they're always the same - I tried modifying the URL, didn't work either. The last one is what it sorts by).
>73 Beckysbooknook: So if you sort by title, then go down to the bottom of the page and copy the URL of the permalink and give that out, your patrons will see your collection sorted by title. They can then sort other ways, if they want, but as long as they use that URL to get to it it will start out sorted by title.

@73-77: I once wrote up an unofficial wiki page to document what can be encoded into the URL/permalink. Beckysbooknook, you might find it useful for constructing a link to your library.

Group: Talk about LibraryThing

