Adblock is the single most useful Firefox plugin available today. Just like watching sitcoms with automatic commercial-skip, adblock’s banner ad suppression system elicits a smug sense of satisfaction even after browsing through your 10,000th ad-free web page. However, a huge barrier to adoption seems to be the lack of a default filter set, so when you first install adblock, nothing happens.
The main issue is that adblock does not have any intelligence as to the content that is included with a webpage; it is just a generic regex-based filter system, so it is only as effective as the filters that you provide. There are plenty of pre-made lists available, but they tend to be overly aggressive in what is suppressed, resulting in occasional broken pages and/or pages that dead-end because adblock has removed the “Next” button. The most dangerous public set seems to be the EasyList, which has a 360+ item block list. Evidence that the creators know of its greedy nature is their inclusion of a 20+ item whitelist to manually compensate what was initially blocked. Even more unstable is the EasyElement list, a set of 570+ substrings that is searched for in the DOM to remove suspected elements directly from the main document.
Instead of using such a large, reactive list of simple and site-specific string matches that tries to suppress 100% of ads, I posit that you only need 2 adblock filters to eliminate 70-80% of ads, and still be confident that legitimate content isn’t being flagged as a false positive. By getting into the heads of HTML writers, we can pick out the most common patterns used to include ads and create regex patterns to suppress the ads.
/(\b|_)ad(x|s?)(\b|_)/
/ad.*\d+[xX]\d+/
At this point, your browsing experience will be significantly improved, but you can bump up your block rate to about 80-90% with a few more simple substring matches. There are many well-known ad providers that exist solely to deliver ads, so we can consolidate those in composite filter rules:
/a(2\.yimg|dserv|dvert|tdmt|twola)/
/b(anners|logads)/
falkag.net
Realistically, reducing the ad load by 90% should be more than sufficient for anyone. Chasing that last 10%, and whitelisting the collateral damage, will always be a losing battle. Your time is better used reading the content that is on the page you requested in the first place.
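If you want to see what these filters catch before loading them into adblock, here is a rough sketch in Python. It is not how adblock itself works: adblock evaluates the filters as JavaScript regular expressions against each requested URL, and this sketch assumes Python's `re` module behaves the same way for these particular patterns on plain ASCII URLs. The sample URLs and the `is_blocked` helper are made up for illustration.

```python
import re

# The two core filters from the post, without the surrounding slashes.
CORE_FILTERS = [
    re.compile(r"(\b|_)ad(x|s?)(\b|_)"),
    re.compile(r"ad.*\d+[xX]\d+"),
]

# The ad-provider filters. "falkag.net" is a plain substring in the post,
# so it is escaped here to keep the dot literal.
PROVIDER_FILTERS = [
    re.compile(r"a(2\.yimg|dserv|dvert|tdmt|twola)"),
    re.compile(r"b(anners|logads)"),
    re.compile(re.escape("falkag.net")),
]

ALL_FILTERS = CORE_FILTERS + PROVIDER_FILTERS


def is_blocked(url, filters):
    """Hypothetical helper: True if any filter matches anywhere in the URL."""
    return any(f.search(url) for f in filters)


# Made-up sample URLs, not taken from any real site.
samples = [
    "http://example.com/ads/banner.gif",
    "http://example.com/img/ad_300x250.jpg",
    "http://adserver.example.net/show?id=7",
    "http://example.com/images/header.png",
    "http://example.com/uploads/photo_800x600.jpg",
]

for url in samples:
    verdict = "blocked" if is_blocked(url, ALL_FILTERS) else "allowed"
    print(verdict, url)
```

With these samples, the first two URLs are caught by the core filters, the third only by the provider filter (via `adserv`), and `header.png` passes through untouched. The last one is worth noting: `uploads` contains `ad` and the file name carries a `800x600` size, so the second core filter matches it even though nothing about it is an ad. Running your own bookmarks through a loop like this is a cheap way to spot that kind of collateral damage.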
Funny how you measure the “dangerousness” of a list by the number of filters. Any reason why you would say that http://adblock.free.fr/adblock.txt is less dangerous? It has fewer filters but those filters are so complex that hardly anybody can tell what they block.
Easylist goes by the recommendations for Adblock Plus - use specific filters and avoid regular expressions. This allows the filter list to be processed very fast. And the whitelisting entries are mostly due to the fact that some sites started to serve regular content through known advertising sites.
Have a look at the Filterset.G whitelist (http://pierceive.com/filtersetg/whitelist-beta/) - now that’s scary…
Comment by Wladimir Palant — January 15, 2007 @ 3:14 pm
“Dangerous”? That’s a bit harsh don’t you think?
The EasyList is fairly aggressive because users want it that way … and “yes” there will be an occasional ‘burp’ in what it blocks. But considering the number of users the EasyList has, problems have been very minimal at best and false-positives have been addressed NOT mainly through whitelisting, but rather through a rewrite of the filtering strings.
You wrote:
“Evidence that the creators know of its greedy nature is their inclusion of a 20+ item whitelist to manually compensate what was initially blocked.”
The irony of your statement is that your first proposed filter string:
/(\b|_)ad(x|s?)(\b|_)/
…is EXACTLY why about 80% of those whitelist strings exist. Most of the whitelistings are for video players served thru an “ad” string. The whitelists allow the player to function correctly on some very MAJOR sites without having to remove the broader generic filter strings like */ads/* or *//ads.*. And I don’t have to whitelist the ENTIRE page. The EasyList works quite well this way. Try watching news video at FoxNews, MSN, Forbes, etc with just that one filter string that you proposed …. you will have a whitelist larger than mine with just that one string.
You wrote:
“ … resulting in occasional broken pages and/or pages that dead-end because adblock has removed the “Next” button.”
I don’t know what ‘next’ button you are talking about. Things like this could occasionally happen, but I currently have no reports of any big problem with things like that … and if someone did have a problem, I would hope that they would bring it to my attention. These things are usually fixed as fast as I can fix them when they occur. Trying to keep pages free of ads without interrupting a user’s surfing experience is no small task … but I love doing it and devote a lot of my time to it. :-)
ps: Adblock Plus does NOT have a problem with large filter lists as long as they follow the simple expression ‘shortcut’ rules. So using string totals is irrelevant to ABP’s operation. I increased the filter size because it does not take any noticeable performance hit.
Sincerely:
rick752 - ABP EasyList/EasyElement author.
Comment by rick752 — January 15, 2007 @ 5:18 pm
I’m not sure who you’re writing for. Are you targeting end-users or filter subscription maintainers? If the former, then they’re not spending time figuring out the best way to create filters. If the latter, then EasyList is the way it is because it’s optimized for the way Adblock Plus works. You can read all about it on adblockplus.org where excellent documentation is maintained and Wladimir explains in his blog which filters work best. I think you’re still used to Adblock’s filter style where regular expressions are preferred and people try to cram as many rules as they can in one expression. That is bad form for Adblock Plus because those types of filters are slower and make debugging harder.
Comment by Stupid Head — April 20, 2007 @ 1:36 pm