Temboz - interesting items
laptops razer techStaff

Because one screen was never enough.

Razer has unveiled what is essentially the mobile version of mission control with its Project Valerie laptop. The laptop opens to reveal two additional 17.3" 4K screens that slide out of the sides to create a mind-boggling 11520 x 2160 resolution display. What's really slick is the mechanism, which uses aluminum hinges that deploy the screens and retract them automatically in seconds.

That 12K display is powered by an NVIDIA GeForce GTX 1080 GPU that uses NVIDIA Surround View to help properly display all those pixels. The "laptop" is currently in concept form, but we have a feeling Razer will be able to carve out a small, but loyal niche of gamers and serious power users looking for more screen real estate.


javascript performance web-developmentJean Vincent

This is the funniest and informative performance article ever written.

This is the funniest and informative performance article ever written. was originally published in Hacker Noon on Medium, where people are continuing the conversation by highlighting and responding to this story.

inner-sunset outer-sunset parksideWill Callan
Inner Sunset, Outer Sunset, Parkside

Photos: Will Callan/Hoodline

Robert Guerra is a family man. He co-owns Guerra Quality Meats (490 Taraval St.) with his brother, Paul, and his cousin, John, and has worked full-time in the multi-generational butcher shop since graduating high school in ’91. He greets his customers like kin, with embraces and Italian phrases. When children follow their parents into the market, Robert asks when they can don aprons and start working.

“Hopefully the next generation will want to take it over,” Robert says.

Part of the effort to ensure Guerra’s longevity will open next month. Guerra’s To Go, located up the block at 345 Taraval, is scheduled to open on Feb. 6th, after three months of renovations to the former Supreme Cleaners, a dry cleaning hub in an old Freemason building. 

Robert Guerra in front of the soon-to-be-unveiled Guerra's To Go.

With the new location, the Guerras hope to provide familiar lunch and dinner options to the families and Sunset residents who rely on their meat selection and cooking tips.

Robert says that busy parents looking to optimize dinnertime, and customers whose enthusiasm for cooking has waned, will be attracted by the new shop’s convenience and affordability.

“Some people are paying 800-900 bucks a week, just going out to dinner,” he says. “It’s a mortgage. You can buy a house for that.”

The menu will include beef bourguignon, handmade lasagnas, an entire barbecue section, and, for dessert, panettone bread pudding and Stemma gelato, made by Robert’s wife, Nicole. 

Guerra’s has been a Sunset staple since 1954, when Robert’s father and uncle—recently arrived from Lucca, Italy—opened the business and entered into competition with the nine other butcher shops that then punctuated Taraval’s stretch to the ocean. 

Deli line at the original Guerra's location.

Today, Guerra’s is one of the few remaining butcher shops in the Sunset. With polished concrete floors and neon signage, Guerra’s To Go will evoke the decade in which Guerra's got its start. 

“People like things that have been in their neighborhood for a long time,” Robert says. “It’s going to be an extension of what we already have.”

Once it debuts Feb. 6th, Guerra’s To Go will be open daily from 10:30am-8:30pm.

Time, but Faster
ACM Queue
The first premise was summed up perfectly by the late Douglas Adams in The Hitchhiker's Guide to the Galaxy: "Time is an illusion. Lunchtime doubly so." The concept of time, when colliding with decoupled networks of computers that run at billions of operations per second, is... well, the truth of the matter is that you simply never really know what time it is. That is why Leslie Lamport's seminal paper on Lamport timestamps was so important to the industry, but this article is actually about wall-clock time, or a reasonably useful estimation of it.
av news
HDMI Forum, Inc. announced the upcoming release of Version 2.1 of the HDMI specification, which supports higher video resolutions and refresh rates including 8K60 and 4K120, Dynamic HDR, and increased bandwidth with a new 48G cable. Backwards compatibility is supported with earlier versions of the HDMI specification. The new spec will be available to all HDMI 2.0 Adopters by Q2 2017. See what other benefits HDMI 2.1 provides by reading on...
Thomas Claburn

With a transcompiling tool, YouTube aims to overcome Python's limitations

Google on Wednesday introduced an open-source project called Grumpy to translate Python code into Go programs.…

Shaun Nichols

Renters freed from restrictions on network providers

San Francisco has become the first major US city to bar building owners from restricting their tenants to specific ISPs.…

techJake Crump

“ Very cool home security system. Read more about it here: http://mashable.com/2017/01/03/aura-home-security-ces/?utm_cid=mash-com-Tw-main-link#bO69kQ8EvgqW ”
– Jake Crump


Rik Henderson

LG used its CES 2017 press conference to unveil a new range of OLED TVs with new high dynamic range (HDR) standards support and Dolby Atmos audio, but the star of the show was undoubtedly the LG Signature OLED W - a TV so thin the company tags it as a...

culture kids lego lego boost news play robotics stem toysLee Mathews

You can build some pretty amazing things with Lego Mindstorms EV3… provided you’re comfortable shelling out more than $300. Lego’s new robotics kit lets you get building for a lot less money. Just […]

The post Lego Boost Is a More Wallet-Friendly Robotics Kit appeared first on Geek.com.


I think this falls somewhere a little short of common knowledge, but obvious once you know it. It lets machines roam in and out of the network without too much config fiddling. Instead, we configure machines to always use “cloud” services but intercept the packets to provide local services.

Here are the pf.conf rules I have on my router.

pass in on cnmac1 proto { udp , tcp } from any to any port domain rdr-to port domain
pass in on cnmac1 proto { udp , tcp } from any to any port ntp rdr-to port ntp

This steals any DNS or NTP traffic bound for the internet and redirects it back to the local machine, servicing it locally.

Normally one gets a DNS server via DHCP, but I usually prefer to use a fixed server of my own choosing, so I override that option in dhclient.conf. That works great outside the house. But when I’m home, I really do want to use the local server because that’s the one that knows about other hostnames on the network. This lets me keep a hardcoded config on my laptop and fix it at the router.

Similarly with NTP, although the situation is a little different since we don’t usually get that from the DHCP server. Instead it’s configured once. I could use the ntp.org server pool, but it’s silly to have a half dozen machines each probing several upstream servers. For a while I used a config that pointed at the router directly, but then when I take a laptop on the road, it can’t sync time at all. Solution: point everything at time.google.com in ntpd.conf, and again have the router fix it up. (Bonus benefit: Windows and Apple machines will also now use the router’s time service with no config fiddling either.)

In short, permanently configure laptops for mobile use, and then configure the router to provide optimized services. This is typically easier than trying to configure the laptop to detect which network it’s using.

Richard Chirgwin

How to spot a side order of Rowhammer in a benign binary

Rowhammer and similar side-channel attacks aren't caught by anti-virus, so a bunch of US boffins have set about working out how to catch their signatures.…

Sean O'Kane

Thermal imaging company Flir has announced a few new toys and tools that will let users see in the dark, measure heat, and help with home improvements.

The first is a new version of the Flir One, a thermal camera attachment for iPhones and Android phones (equipped with USB Type-C ports). The new Flir One features a new image processor and higher resolution visual camera, and is cheaper than its 2015 predecessor — just $199 instead of the old price of $250. It also has a height-adjustable dongle, meaning you can now use it with phone cases, something that previously required flimsy adapters.

Flir says that it saw a number of customers use the previous Flir One in professional or semi-professional settings, such as plumbing, contracting,...


Justin Krajeski
Various card readers in a pile on a wooden surface.

After spending eight hours researching and testing 12 card readers, we found that the IOGear USB-C 3-Slot Card Reader is the best option for anyone who needs an SD card reader for a new laptop with USB-C ports. The IOGear delivered fast, consistent speeds, and supports SD, microSD, and CF cards.

maine religion us news world newsAssociated Press in Portland, Maine

One of the last members of a nearly extinct religious society at Sabbathday Lake has died, a loss for a group that’s dwindled because members are celibate

Sister Frances Carr, one of the last remaining members of the nearly extinct religious society called the Shakers, has died. She was 89.

Carr died Monday surrounded by family and friends in the dwelling house at the Shaker community at Sabbathday Lake in New Gloucester, Maine after a brief battle with cancer, said Brother Arnold Hadd, one of the group’s two remaining members.

Thomas Claburn

Advertising industry consultant argues banners are undermined by fees and fraud

One dollar of online display advertising will buy you approximately $0.03 worth of actual ads seen by real people, according to Bob Hoffman, a partner in media consultancy Type A Group.…

linked pdf previewJohn Voorhees

After macOS Sierra was released, reports of problems with PDFs created with Fujitsu’s ScanSnap scanner surfaced. Apple resolved those problems with the release of macOS 10.12.1, but it turns out the problems with PDFs on Sierra run deeper.

Adam Engst of TidBITS has a rundown of several issues that plague Preview, Apple’s PDF app, and many third-party PDF apps. The source of the problems seems to be PDFKit, a developer framework for handling PDFs in macOS. According to developers who spoke to Engst, Apple rewrote parts of PDFKit to unify the macOS and iOS PDF code bases. In the process, developers say that Apple introduced a series of significant bugs and deprecated PDFKit features that broke third-party apps that use PDFKit.

Most recently, the macOS 10.12.2 release seems to have introduced a Preview bug that deletes any OCR layer embedded in a PDF that is edited in Preview. Meanwhile, third-party developers have run into new bugs that affect the handling of PDF annotations.

Engst, the co-author of Take Control of Preview, concludes that:

… I have to recommend that Sierra users avoid using Preview to edit PDF documents until Apple fixes these bugs. If editing a PDF in Preview is unavoidable, be sure to work only on a copy of the file and retain the original in case editing introduces corruption of any sort. Smile’s PDFpen [which doesn’t use PDFKit] is the obvious alternative for PDF manipulation of all sorts (and for documentation, we have “Take Control of PDFpen 8” too), although Adobe’s Acrobat DC is also an option, albeit an expensive one.

→ Source: tidbits.com

Richard Chirgwin

Where were you in June 1995? Coding image libraries? Let's have a chat

Slackware has raced out of the blocks in 2017, issuing one patch for the libpng image library on New Year's Day, and two Mozilla patches on January 2.…

elsewhere goFilippo Valsorda

I was asked to contribute a post to the excellent Gopher Academy advent series. I took the occasion to write down what I learned deploying a Go service on the Cloudflare edge.

The result is a catalogue of what you need to know before you drop NGINX from in front of your Go server.

The net/http part is a bit of a follow up to my timeouts post, but with more links to Go issues (tl;dr: use Go 1.8). There's also a section summarizing how to use crypto/tls securely and performantly.

Server Timeouts

So you want to expose Go on the Internet | Gopher Academy Blog (archive) (republished on the Cloudflare blog)

business post potus ripoffs white houseCory Doctorow

In The Competition Initiative and Hidden Fees, the White House's National Economic Council documents the widespread use of deceptive "service charges" that businesses levy, allowing them to advertise prices that are wildly divergent from what you'll actually pay -- think of the $30, unavoidable "resort fees" added to a hotel bill; the $25 "processing fees" added to concert tickets, the random fees added to telecom bills, etc, all adding up to billions transferred away from American shoppers to big business. (more…)

OpenSSL is prone to a denial of service (DoS) vulnerability. This allows a remote attacker to cause a DoS condition through high consumption of system resources via certain vulnerable vectors.
Armin Ronacher

This should have been obvious to me for a longer time, but until earlier today I did not really realize the severity of the issues caused by str.format on untrusted user input. It came up as a way to bypass the Jinja2 Sandbox in a way that would permit retrieving information that you should not have access to which is why I just pushed out a security release for it.

However I think the general issue is quite severe and needs to be discussed because most people are most likely not aware of how easy it is to exploit.

The Core Issue

Starting with Python 2.6 a new format string syntax landed inspired by .NET which is also the same syntax that is supported by Rust and some other programming languages. It's available behind the .format() method on byte and unicode strings (on Python 3 just on unicode strings) and it's also mirrored in the more customizable string.Formatter API.

One of the features of it is that you can address both positional and keyword arguments to the string formatting and you can explicitly reorder items at all times. However the bigger feature is that you can access attributes and items of objects. The latter is what is causing the problem here.

Essentially one can do things like the following:

>>> 'class of {0} is {0.__class__}'.format(42)
"class of 42 is <class 'int'>"

In essence: whoever controls the format string can access potentially internal attributes of objects.

Where does it Happen?

First question is why would anyone control the format string. There are a few places where it shows up:

  • untrusted translators on string files. This is a big one because many applications that are translated into multiple languages will use new-style Python string formatting and not everybody will vet all the strings that come in.
  • user exposed configuration. On some systems users might be permitted to configure some behavior and that might be exposed as format strings. In particular I have seen it where users can configure notification mails, log message formats or other basic templates in web applications.

Levels of Danger

For as long as only C interpreter objects are passed to the format string you are somewhat safe because the worst you can discover is some internal reprs like the fact that something is an integer class above.

However, it becomes tricky once Python objects are passed in. The reason for this is that the amount of stuff that is exposed from Python functions is pretty crazy. Here is an example from a hypothetical web application setup that would leak the secret key:

CONFIG = {
    'SECRET_KEY': 'super secret key'
}

class Event(object):
    def __init__(self, id, level, message):
        self.id = id
        self.level = level
        self.message = message

def format_event(format_string, event):
    return format_string.format(event=event)

If the user can inject format_string here they could discover the secret string like this:
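A self-contained sketch of such an attack follows. It assumes the secret sits in a module-level dict named CONFIG (the name is an assumption for illustration) in the same module that defines Event. The format string walks from the bound `__init__` method to that module's globals and then indexes into the dict:

```python
# Hypothetical application config; the name CONFIG is an assumption.
CONFIG = {
    'SECRET_KEY': 'super secret key'
}

class Event(object):
    def __init__(self, id, level, message):
        self.id = id
        self.level = level
        self.message = message

def format_event(format_string, event):
    return format_string.format(event=event)

# Attacker-controlled format string: attribute access reaches the globals
# of the module defining Event, item access then reads the config dict.
malicious = '{event.__init__.__globals__[CONFIG][SECRET_KEY]}'
leaked = format_event(malicious, Event(1, 'info', 'hello'))
print(leaked)
```

Nothing here calls a function or evaluates code. Plain attribute and item lookups are enough to reach the secret, which is why restricting attribute access is the natural place to intervene.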


Sandboxing Formatting

So what do you do if you do need to let someone else provide format strings? You can use the somewhat undocumented internals to change the behavior.

from string import Formatter
try:
    from collections.abc import Mapping  # Python 3.3 and later
except ImportError:
    from collections import Mapping  # Python 2

class MagicFormatMapping(Mapping):
    """This class implements a dummy wrapper to fix a bug in the Python
    standard library for string formatting.

    See http://bugs.python.org/issue13598 for information about why
    this is necessary.
    """

    def __init__(self, args, kwargs):
        self._args = args
        self._kwargs = kwargs
        self._last_index = 0

    def __getitem__(self, key):
        if key == '':
            idx = self._last_index
            self._last_index += 1
            try:
                return self._args[idx]
            except LookupError:
                pass
            key = str(idx)
        return self._kwargs[key]

    def __iter__(self):
        return iter(self._kwargs)

    def __len__(self):
        return len(self._kwargs)

# This is a necessary API but it's undocumented and moved around
# between Python releases
try:
    from _string import formatter_field_name_split
except ImportError:
    formatter_field_name_split = lambda \
        x: x._formatter_field_name_split()

class SafeFormatter(Formatter):

    def get_field(self, field_name, args, kwargs):
        first, rest = formatter_field_name_split(field_name)
        obj = self.get_value(first, args, kwargs)
        for is_attr, i in rest:
            if is_attr:
                obj = safe_getattr(obj, i)
            else:
                obj = obj[i]
        return obj, first

def safe_getattr(obj, attr):
    # Expand the logic here.  For instance on 2.x you will also need
    # to disallow func_globals, on 3.x you will also need to hide
    # things like cr_frame and others.  So ideally have a list of
    # objects that are entirely unsafe to access.
    if attr[:1] == '_':
        raise AttributeError(attr)
    return getattr(obj, attr)

def safe_format(_string, *args, **kwargs):
    formatter = SafeFormatter()
    kwargs = MagicFormatMapping(args, kwargs)
    return formatter.vformat(_string, args, kwargs)

Now you can use the safe_format method as a replacement for str.format:

>>> '{0.__class__}'.format(42)
"<type 'int'>"
>>> safe_format('{0.__class__}', 42)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: __class__
Deborah M Gordon

The ant colony has often served as a metaphor for human order and hierarchy. But real ant society is radical to its core

By Deborah M Gordon

Read at Aeon

canon 24-70mm f/2.8l ii canon 50mm f/1.2l canon 70-200mm f/2.8l is ii equipment geek articles nikon 24-120mm f/4g ed af-s vr photography recommendations sony fe 16-35mm f/4 za oss sony fe 24-70mm f/4 za ossRoger Cicala

The internet is an interesting resource. Once you’ve put a reference up, it’s there forever. Over time, things may change, but that 10-year-old article doesn’t. A few weeks ago someone used some old articles I’d written (1, 2, 3) as a reason why they don’t use protective UV filters. They claimed I had said there was no need to use UV filters.

That’s kind of what I said, but so compressed as to be inaccurate. What I really said was this:

  1. Evaluate the cost to benefit ratio of using a UV filter. Don’t use a $100 filter to protect a $100 front element.
  2. Evaluate the situation. If you are in a high-risk environment, use a filter to maximize your protection.
  3. Never, ever use a cheap $30 filter unless you don’t care about image quality.

But at the time I wrote those posts (2010 to 2013), that meant I didn’t use UV filters very often because it usually wasn’t worth the money. Front element replacements weren’t that expensive, and high-quality UV filters were expensive.

A Bit About My Qualifications

As with every article I write, there will be three or four comments saying, “I’ve had 20 different lenses and never had this problem.” Good for you. I take care of around 20,000 lenses at any given time. About 15,000 of those never had a problem either. You do the math. So I write about the problems that occur so you can be forewarned about the problems that may occur with your lenses. Forewarned is forearmed.

We replace front elements in lenses for cosmetic reasons every day. Every. Damn. Day. Sure, they’re rental lenses and maybe don’t get babied as much as your lenses do. You may get lucky and never deal with any of this. But it’s always good to make an informed decision about what precautions you want to take. In this particular case, basing your decisions on 6 or 8 year old data can be a mistake.

Discarded front elements. Each bin has around 125 front elements and weighs 30 to 50 pounds each. Lensrentals.com, 2016

There Has Been a Big Change

When I wrote most of the articles I mentioned above, front elements cost from $70 to $200, and good filters cost $70 to $140. Today good UV filters still cost from $70 to $140, even in 82mm size. But the cost of replacing front elements has skyrocketed.

I’ve always said to evaluate the cost-benefit ratio of whether you want to use a protective filter. Now the price has changed, so that change should be factored into the equation. (BTW – the costs I’m giving as examples below are our costs. Your prices may be different. But we’re a really large repair customer based in the U.S., so by ‘different’ I mean ‘yours will probably be a bit more expensive.’)

Most people aren’t surprised that a Canon 70-200 f/2.8 IS lens ($1,800 or so) has a $270 front element, or that the Canon 24-70 f/2.8 Mk II ($1,700) runs just over $200 for a new front element. You might not even be surprised that a 150-600mm telephoto lens that costs $900 or so runs $200 to $250 to get the front element replaced (either brand), since that is a big piece of glass.

But you might get a little queasy when you find out a Nikon 24-120mm lens that costs from $900 to $1100 new (they go on sale a lot) costs $320 to replace the front element. If you own a Zeiss Otus lens, you probably have reasonable funds in your bank account, but you may still have sticker shock if you need a front element replaced; it will cost $900 to $1,500. Some Sony FE lenses require replacement of the entire optical group ($800 to $1,500) to replace a front element scratch because the front element isn’t available as a part; the entire optical assembly is the only part. (Sony is working on separating the front element part to reduce this cost. They’ve already done so on the FE 24-70mm f/4; now a front element replacement is only $270. They say they are doing so with most of their other lenses, too, but right now an FE 16-35 f/4 front glass replacement is $785, for example. We have bins full of Sony lenses that just aren’t worth the cost of front element replacements, hoping that some day the price will become reasonable.)

More important, perhaps, is that the cost of a front element replacement can be very high even for a not-so-expensive lens. A decade ago you could assume a relatively inexpensive lens (under $1,000) had a relatively inexpensive front element. A Canon 70-200 f/4 IS lens, for example, cost around $950 and a front element replacement was under $100. Most newly released under-$1,000 lenses will run $200-$300 for a front element replacement.

My point here is not to provide a list of what every manufacturer charges for every front element. You can call or email them if you want to know. It’s not that one manufacturer has really high prices and another doesn’t. They all have some high and low-cost lenses. For example, manufacturer X has front element replacement costs that range from 9% of the lens price to 33%. If there’s a general rule, it’s that lenses released 5 or more years ago are reasonable in cost for front element replacement, while newly released lenses are expensive. There are exceptions, of course. And so I’m clear, it’s not that the cost goes down over time. Unlike sale prices, repair prices don’t go down after the lens has been out a while.

So, front element replacement costs, particularly on new lenses, are higher than they used to be and some lenses are breathtakingly higher. There are some real reasons this is so. These new, sharper lenses often have front elements that are unique glass types, nano-coated, highly aspherical and a lot of other things. Manufacturers are pulling every trick out of their bag to make lenses sharper and better. One of those tricks is that front elements, which often used to be simple, protective elements, got fancy. ‘Fancy’ is the optical word for damned expensive.

Some of the increased expense is just poor planning. Let’s pretend, for example, the manufacturer thought, “Well, this front group is so difficult to adjust optically that the service center can’t do it. So instead of making the front element a part, let’s just make the entire front group of lens elements a part so that we can adjust it in the factory, and the service center can just replace the whole group.” I have talked to manufacturers’ lens designers who had absolutely no idea the front element was more likely to need replacing than other parts (we replace more front elements than all other parts combined). That’s not what they do; they design lenses.

Does This Really Matter?

Small front element scratches, the kind that filters will protect against, rarely affect image quality at all. Once in a while, shooting into bright lights, you may (or may not) get some flare from the scratch. So if your lens gets some little front element scratches, as most eventually do, it matters very little.

But if you plan on selling your lens at some point, it does affect resale value significantly. We make this decision every day: The lens has a small scratch on the front group, is it worth the price of replacing it? A few years ago we’d spend $100 on a new front element and get $150 more in asking price for the used lens. These days the math is different. It’s not worth spending $250 on a new front group to get a $150 higher selling price.

Of course, there’s a positive for people who buy used lenses. A small front element scratch means a bargain lens for sale that will function just fine.

So does this mean you should put a UV filter on all of your lenses? No. I still recommend looking at the cost-to-benefit ratio for the specific lens, and consideration of what you do with it.

In most cases, a lens hood (actually mounted on the lens, not left in your bag) provides plenty of front element protection. Wide-angle lenses are a unique case, though, since those hoods don’t provide much protection. But if you are outdoors and your lens is exposed to dust, sand, water, or other things lenses don’t like, then a hood doesn’t provide protection and a UV filter is probably worthwhile.

How Much Does A Filter Impact Image

Well, if you buy a $30 filter, then it can impact the image a lot. Waviness in the thickness of glass, poor coatings, poor quality glass, even shiny metal in the mounting ring can cause problems. If you choose to buy a cheap filter, you’ll probably see effects if you look critically; although it won’t be in every shot. You may see some effect on absolute sharpness, but you’re far more likely to see effects from light flare, ugly bokeh, ghosting, and reflections.

A high-quality filter is made from good optical glass, flat to within 1/4 wavelength, and multicoated on BOTH SIDES. It’s expensive, but it doesn’t have much effect on image quality at all. (When you do your filter shopping, make sure the filter is coated on both sides; some cheap filter makers multicoat one side only, then advertise it as multicoated). A good filter should avoid most (not all, but almost all) effects regarding ghosting, flare, and reflection. It shouldn’t affect sharpness even at the highest level of measurement.

Here’s an example of a Canon 50mm f/1.2 lens tested on our optical bench to demonstrate there’s no sharpness penalty with an excellent filter. First the lens with no filter at all.

Olaf Optical Testing, 2016

And then tested with a high-quality UV filter in place.

Olaf Optical Testing, 2016

There are no effects on MTF from the filter, either on or off axis. This doesn’t mean there might not be a bit of ghosting if you’re shooting street lights at night or star trails, etc. But it won’t be often, and you can always remove the filter if you notice that happening in a particular shot.

So Do I Recommend Filters Now?

Not necessarily, no. I support common sense and looking at the cost-to-benefit ratio. But if you have one of the newer lenses, especially if you know the front element has Nano-coating, is aspherical, or is just expensive to replace, then I’d certainly at least have a filter available. I’d especially recommend it if you expect you might sell that lens someday; even a minor scratch is probably a 10% price reduction on the retail market. If you say they’ll pry that lens out of your cold, dead fingers, then I wouldn’t worry so much.

I also really recommend you look carefully at the filter threads and front element BEFORE you mount a filter on the lens, especially if it’s an ‘ultra-thin’ filter. Several lenses we know of have slightly projecting front elements, and some ultra-thin filters can actually touch the center of the front element causing a scratch if the filter is tightened all the way down.

While I’m on the subject of touching the front element, though, those nice new thick lens caps a lot of manufacturers use these days, the ones with the big spring-loaded squeeze release handles, are a problem for some lenses, especially when they are 82mm wide. There’s a lot of plastic underneath that cap, right over the center of the lens. If the lens cap gets pushed down in the middle, it can scrape the front element. I know: glass is harder than plastic. But coatings aren’t. And a coating scratch is just as visible as a glass scratch.

Roger Cicala


December, 2016

algorithms benchmarks concurrency cs-tries ctries data structures hashmaps inverted bitmaps lock-freedom message queues message-oriented middleware messaging performance triesTyler Treat

A common problem in messaging middleware is that of efficiently matching message topics with interested subscribers. For example, assume we have a set of subscribers, numbered 1 to 3:

Subscriber  Match Request
1           forex.usd
2           forex.*
3           stock.nasdaq.msft

And we have a stream of messages, numbered 1 to N:

Message  Topic
1        forex.gbp
2        stock.nyse.ibm
3        stock.nyse.ge
4        forex.eur
5        forex.usd
N        stock.nasdaq.msft

We are then tasked with routing messages whose topics match the respective subscriber requests, where a “*” wildcard matches any word. This is frequently a bottleneck for message-oriented middleware like ZeroMQ, RabbitMQ, ActiveMQ, TIBCO EMS, et al. Because of this, there are a number of well-known solutions to the problem. In this post, I’ll describe some of these solutions, as well as a novel one, and attempt to quantify them through benchmarking. As usual, the code is available on GitHub.

The Naive Solution

The naive solution is pretty simple: use a hashmap that maps topics to subscribers. Subscribing involves adding a new entry to the map (or appending to a list if it already exists). Matching a message to subscribers involves scanning through every entry in the map, checking if the match request matches the message topic, and returning the subscribers for those that do.

Inserts are approximately O(1) and lookups approximately O(n*m) where n is the number of subscriptions and m is the number of words in a topic. This means the performance of this solution is heavily dependent upon how many subscriptions exist in the map and also the access patterns (rate of reads vs. writes). Since most use cases are heavily biased towards searches rather than updates, the naive solution—unsurprisingly—is not a great option.
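As a rough illustration of the naive approach (a sketch, not the benchmarked implementation), the following assumes topics are dot-separated words and that a "*" matches exactly one word. The names `matches` and `NaiveMatcher` are made up for this example.

```python
from collections import defaultdict

def matches(pattern, topic):
    """Return True if a subscription pattern matches a topic.

    Assumption: both are dot-separated, and "*" matches exactly one word.
    """
    p_words = pattern.split(".")
    t_words = topic.split(".")
    if len(p_words) != len(t_words):
        return False
    return all(p == "*" or p == t for p, t in zip(p_words, t_words))

class NaiveMatcher:
    def __init__(self):
        # Maps each match request to the subscribers that registered it.
        self._subs = defaultdict(list)

    def subscribe(self, pattern, subscriber):
        # Roughly O(1): one hashmap insert or list append.
        self._subs[pattern].append(subscriber)

    def unsubscribe(self, pattern, subscriber):
        subs = self._subs.get(pattern, [])
        if subscriber in subs:
            subs.remove(subscriber)
        if not subs:
            self._subs.pop(pattern, None)

    def lookup(self, topic):
        # Roughly O(n*m): every entry is scanned and compared word by word.
        found = []
        for pattern, subs in self._subs.items():
            if matches(pattern, topic):
                found.extend(subs)
        return found

matcher = NaiveMatcher()
matcher.subscribe("forex.usd", 1)
matcher.subscribe("forex.*", 2)
matcher.subscribe("stock.nasdaq.msft", 3)
print(matcher.lookup("forex.usd"))       # subscribers 1 and 2
print(matcher.lookup("stock.nyse.ibm"))  # no subscribers
```

The full scan in `lookup` is the part that degrades as the number of subscriptions grows.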

The microbenchmark below compares the performance of subscribe, unsubscribe, and lookup (matching) operations, first using an empty hashmap (what we call cold) and then with one containing 1,000 randomly generated 5-word topic subscriptions (what we call hot). With the populated subscription map, lookups are about three orders of magnitude slower, which is why we have to use a log scale in the chart below.

      subscribe  unsubscribe  lookup
cold  172ns      51.2ns       787ns
hot   221ns      55ns         815,787ns

Inverted Bitmap

The inverted bitmap technique builds on the observation that lookups are more frequent than updates and assumes that the search space is finite. Consequently, it shifts some of the cost from the read path to the write path. It works by storing a set of bitmaps, one per topic, or criteria, in the search space. Subscriptions are then assigned an increasing number starting at 0. We analyze each subscription to determine the matching criteria and set the corresponding bits in the criteria bitmaps to 1. For example, assume our search space consists of the following set of topics:

  • forex.usd
  • forex.gbp
  • forex.jpy
  • forex.eur
  • stock.nasdaq
  • stock.nyse

We then have the following subscriptions:

  • 0 = forex.* (matches forex.usd, forex.gbp, forex.jpy, and forex.eur)
  • 1 = stock.nyse (matches stock.nyse)
  • 2 = *.* (matches everything)
  • 3 = stock.* (matches stock.nasdaq and stock.nyse)

When we index the subscriptions above, we get the following set of bitmaps:

 Criteria 0 1 2 3
forex.usd 1 0 1 0
forex.gbp 1 0 1 0
forex.jpy 1 0 1 0
forex.eur 1 0 1 0
stock.nasdaq 0 0 1 1
stock.nyse 0 1 1 1
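The indexing step above can be sketched in a few lines. This is only an illustration under stated assumptions: `InvertedBitmap` and `matches` are hypothetical names, each bitmap is a plain Python integer rather than a compressed bitmap, and a "*" is assumed to match exactly one word.

```python
def matches(pattern: str, topic: str) -> bool:
    """Return True if the pattern matches the topic ("*" matches one word)."""
    p_words = pattern.split(".")
    t_words = topic.split(".")
    if len(p_words) != len(t_words):
        return False
    return all(p == "*" or p == t for p, t in zip(p_words, t_words))


class InvertedBitmap:
    """Hypothetical index: one bitmap per topic in a known, finite search space."""

    def __init__(self, topics) -> None:
        self._bitmaps = {topic: 0 for topic in topics}
        self._next_id = 0

    def subscribe(self, pattern: str) -> int:
        # The write path does the expensive work: test every known topic.
        sub_id = self._next_id
        self._next_id += 1
        for topic in self._bitmaps:
            if matches(pattern, topic):
                self._bitmaps[topic] |= 1 << sub_id
        return sub_id

    def lookup(self, topic: str) -> list:
        # The read path is a single dictionary lookup plus reading set bits.
        bitmap = self._bitmaps.get(topic, 0)
        return [i for i in range(self._next_id) if (bitmap >> i) & 1]


index = InvertedBitmap([
    "forex.usd", "forex.gbp", "forex.jpy", "forex.eur",
    "stock.nasdaq", "stock.nyse",
])
for pattern in ("forex.*", "stock.nyse", "*.*", "stock.*"):
    index.subscribe(pattern)
```

The set bits returned by each lookup correspond to the rows of the table above, for example subscriptions 1, 2, and 3 for stock.nyse.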

When we match a message, we simply need to look up the corresponding bitmap and check the set bits. As we see below, subscribe and unsubscribe are quite expensive compared to the naive solution, but lookups now fall well below half a microsecond, which is pretty good (the fact that the chart below doesn’t use a log scale like the one above should be an indictment of the naive hashmap-based solution).

      subscribe  unsubscribe  lookup
cold  3,795ns    198ns        380ns
hot   3,863ns    198ns        395ns

The inverted bitmap is a better option than the hashmap when we have a read-heavy workload. One limitation is it requires us to know the search space ahead of time or otherwise requires reindexing which, frankly, is prohibitively expensive.

Optimized Inverted Bitmap

The inverted bitmap technique works well enough, but only if the topic space is fairly static. It also falls over pretty quickly when the topic space and number of subscriptions are large, say, millions of topics and thousands of subscribers. The main benefit of topic-based routing is it allows for faster matching algorithms in contrast to content-based routing, which can be exponentially slower. The truth is, to be useful, your topics probably consist of stock.nyse.ibm, stock.nyse.ge, stock.nasdaq.msft, stock.nasdaq.aapl, etc., not stock.nyse and stock.nasdaq. We could end up with an explosion of topics and, even with efficient bitmaps, the memory consumption tends to be too high despite the fact that most of the bitmaps are quite sparse.

Fortunately, we can reduce the amount of memory we consume using a fairly straightforward optimization. Rather than requiring the entire search space a priori, we simply require the max topic size, in terms of words, e.g. stock.nyse.ibm has a size of 3. We can handle topics of the max size or less, e.g. stock.nyse.bac, stock.nasdaq.txn, forex.usd, index, etc. If we see a message with more words than the max, we can safely assume there are no matching subscriptions.

The optimized inverted bitmap works by splitting topics into their constituent parts. Each constituent position has a set of bitmaps, and we use a technique similar to the one described above on each part. We end up with a bitmap for each constituent which we perform a logical AND on to give a resulting bitmap. Each 1 in the resulting bitmap corresponds to a subscription. This means if the max topic size is n, we only AND at most n bitmaps. Furthermore, if we come across any empty bitmaps, we can stop early since we know there are no matching subscribers.

Let’s say our max topic size is 2 and we have the following subscriptions:

  • 0 = forex.*
  • 1 = stock.nyse
  • 2 = index
  • 3 = stock.*

The inverted bitmap for the first constituent looks like the following:

       forex.*  stock.nyse  index  stock.*
null   0        0           0      0
forex  1        0           0      0
stock  0        1           0      1
index  0        0           1      0
other  0        0           0      0

And the second constituent bitmap:

       forex.*  stock.nyse  index  stock.*
null   0        0           1      0
nyse   0        1           0      0
other  1        0           0      1

The “null” and “other” rows are worth pointing out. “Null” simply means the topic has no corresponding constituent.  For example, “index” has no second constituent, so “null” is marked. “Other” allows us to limit the number of rows needed such that we only need the ones that appear in subscriptions.  For example, if messages are published on forex.eur, forex.usd, and forex.gbp but I merely subscribe to forex.*, there’s no need to index eur, usd, or gbp. Instead, we just mark the “other” row which will match all of them.

Let’s look at an example using the above bitmaps. Imagine we want to route a message published on forex.eur. We split the topic into its constituents: “forex” and “eur.” We get the row corresponding to “forex” from the first constituent bitmap, the one corresponding to “eur” from the second (other), and then AND the rows.

           forex.*  stock.nyse  index  stock.*
1 = forex  1        0           0      0
2 = other  1        0           0      1
AND        1        0           0      0

The forex.* subscription matches.

Let’s try one more example: a message published on stock.nyse.

                   forex.*  stock.nyse  index  stock.*
1 = stock          0        1           0      1
2 = nyse OR other  1        1           0      1
AND                0        1           0      1

In this case, we also need to OR the “other” row for the second constituent. This gives us a match for stock.nyse and stock.*.
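The per-position tables and the AND/OR steps above can be sketched as follows. This is a simplified illustration, not the benchmarked implementation: `OptimizedInvertedBitmap` is a hypothetical name, bitmaps are plain Python integers, and the "null" and "other" rows are modeled with sentinel objects so they cannot collide with real words.

```python
NULL = object()   # row for "the topic has no word at this position"
OTHER = object()  # row for "*", which matches any word at this position


class OptimizedInvertedBitmap:
    """Hypothetical index with one {word -> bitmap} table per word position."""

    def __init__(self, max_size: int) -> None:
        self._max_size = max_size
        self._rows = [dict() for _ in range(max_size)]
        self._patterns = []

    def subscribe(self, pattern: str) -> int:
        words = pattern.split(".")
        if len(words) > self._max_size:
            raise ValueError("pattern exceeds the max topic size")
        sub_id = len(self._patterns)
        self._patterns.append(pattern)
        for pos, table in enumerate(self._rows):
            if pos >= len(words):
                key = NULL
            elif words[pos] == "*":
                key = OTHER
            else:
                key = words[pos]
            table[key] = table.get(key, 0) | (1 << sub_id)
        return sub_id

    def lookup(self, topic: str) -> list:
        words = topic.split(".")
        if len(words) > self._max_size:
            return []  # more words than the max: nothing can match
        result = (1 << len(self._patterns)) - 1  # start with every bit set
        for pos, table in enumerate(self._rows):
            if pos >= len(words):
                row = table.get(NULL, 0)
            else:
                # OR in the "other" row so wildcard subscriptions also match.
                row = table.get(words[pos], 0) | table.get(OTHER, 0)
            result &= row
            if result == 0:
                return []  # empty bitmap: stop early
        return [p for i, p in enumerate(self._patterns) if (result >> i) & 1]


index = OptimizedInvertedBitmap(max_size=2)
for pattern in ("forex.*", "stock.nyse", "index", "stock.*"):
    index.subscribe(pattern)
```

Running the two worked examples through this sketch gives forex.* for forex.eur, and stock.nyse plus stock.* for stock.nyse.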

Subscribe operations are significantly faster with the space-optimized inverted bitmap compared to the normal inverted bitmap, but lookups are much slower. However, the optimized version consumes roughly 4.5x less memory for every subscription. The increased flexibility and improved scalability make the optimized version a better choice for all but the most latency-sensitive use cases.

      subscribe  unsubscribe  lookup
cold  1,053ns    330ns        2,724ns
hot   1,076ns    371ns        3,337ns

Trie

The optimized inverted bitmap improves space complexity, but it does so at the cost of lookup efficiency. Is there a way we can reconcile both time and space complexity? While inverted bitmaps allow for efficient lookups, they are quite wasteful for sparse sets, even when using highly compressed bitmaps like Roaring bitmaps.

Tries can often be more space efficient in these circumstances. When we add a subscription, we descend the trie, adding nodes along the way as necessary, until we run out of words in the topic. Finally, we add some metadata containing the subscription information to the last node in the chain. To match a message topic, we perform a similar traversal. If a node doesn’t exist in the chain, we know there are no subscribers. One downside of this method is, in order to support wildcards, we must backtrack on a literal match and check the “*” branch as well.
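A minimal sketch of the descent and the wildcard backtracking described above might look like this. The class names are hypothetical, and the sketch assumes published topics never contain a literal "*" word (a small guard avoids visiting the same branch twice if they do).

```python
class TrieNode:
    def __init__(self) -> None:
        self.children = {}     # word -> TrieNode
        self.subscribers = []  # metadata stored on the last node of a pattern


class SubscriptionTrie:
    """Hypothetical single-threaded subscription trie."""

    def __init__(self) -> None:
        self._root = TrieNode()

    def subscribe(self, pattern: str, subscriber: int) -> None:
        # Descend the trie, adding nodes as needed, one per word.
        node = self._root
        for word in pattern.split("."):
            if word not in node.children:
                node.children[word] = TrieNode()
            node = node.children[word]
        node.subscribers.append(subscriber)

    def lookup(self, topic: str) -> list:
        found = []
        self._collect(self._root, topic.split("."), 0, found)
        return found

    def _collect(self, node, words, pos, found) -> None:
        if pos == len(words):
            found.extend(node.subscribers)
            return
        # Follow the literal branch first, then backtrack to the "*" branch.
        keys = (words[pos],) if words[pos] == "*" else (words[pos], "*")
        for key in keys:
            child = node.children.get(key)
            if child is not None:
                self._collect(child, words, pos + 1, found)


trie = SubscriptionTrie()
for sub_id, pattern in enumerate(("forex.*", "stock.nyse", "index", "stock.*")):
    trie.subscribe(pattern, sub_id)
```

A lookup for stock.nyse walks stock, then nyse, and then backtracks to the "*" child of stock, which is the extra read-path cost mentioned above.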

For the given set of subscriptions, the trie would look something like the following:

  • forex.*
  • stock.nyse
  • index
  • stock.*

You might be tempted to ask: “why do we even need the “*” nodes? When someone subscribes to stock.*, just follow all branches after “stock” and add the subscriber.” This would indeed move the backtracking cost from the read path to the write path, but—like the first inverted bitmap we looked at—it only works if the search space is known ahead of time. It would also largely negate the memory-usage benefits we’re looking for since it would require pre-indexing all topics while requiring a finite search space.

It turns out, this trie technique is how systems like ZeroMQ and RabbitMQ implement their topic matching due to its balance between space and time complexity and overall performance predictability.

      subscribe  unsubscribe  lookup
cold  406ns      221ns        2,145ns
hot   443ns      257ns        2,278ns

We can see that, compared to the optimized inverted bitmap, the trie performs much more predictably with relation to the number of subscriptions held.

Concurrent Subscription Trie

One thing we haven’t paid much attention to so far is concurrency. Indeed, message-oriented middleware is typically highly concurrent since it has to deal with heavy IO (reading messages from the wire, writing messages to the wire, reading messages from disk, writing messages to disk, etc.) and CPU operations (like topic matching and routing). Subscribe, unsubscribe, and lookup operations are usually all happening in different threads of execution. This is especially important when we want to take advantage of multi-core processors.

It wasn’t shown, but all of the preceding algorithms used global locks to ensure thread safety between read and write operations, making the data structures safe for concurrent use. However, the microbenchmarks don’t really show the impact of this, which we will see momentarily.

Lock-freedom, which I’ve written about, allows us to increase throughput at the expense of increased tail latency.

Lock-free concurrency means that while a particular thread of execution may be blocked, all CPUs are able to continue processing other work. For example, imagine a program that protects access to some resource using a mutex. If a thread acquires this mutex and is subsequently preempted, no other thread can proceed until this thread is rescheduled by the OS. If the scheduler is adversarial, it may never resume execution of the thread, and the program would be effectively deadlocked. A key point, however, is that the mere lack of a lock does not guarantee a program is lock-free. In this context, “lock” really refers to deadlock, livelock, or the misdeeds of a malevolent scheduler.

The concurrent subscription trie, or CS-trie, is a new take on the trie-based solution described earlier. Detailed here, it combines the idea of the topic-matching trie with that of a Ctrie, or concurrent trie, which is a non-blocking concurrent hash trie.

The fundamental problem with the trie, as it relates to concurrency, is it requires a global lock, which severely limits throughput. To address this, the CS-trie uses indirection nodes, or I-nodes, which remain present in the trie even as the nodes above and below change. Subscriptions are then added or removed by creating a copy of the respective node, and performing a CAS on its parent I-node. This allows us to add, remove, and look up subscriptions concurrently and in a lock-free, linearizable manner.
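To make the copy-then-CAS update concrete, here is a deliberately tiny sketch of a single I-node, not the CS-trie itself. Everything in it is an assumption for illustration: `INode`, `compare_and_swap`, and `add_subscriber` are hypothetical names, and because Python has no compare-and-swap primitive, the helper emulates one with a small internal lock, so this sketch shows the retry pattern but is not actually lock-free.

```python
import threading


class INode:
    """Hypothetical indirection node pointing at an immutable snapshot."""

    def __init__(self, main) -> None:
        self._main = main
        # Stands in for a hardware CAS instruction; a real implementation
        # would use an atomic pointer swap instead of a lock.
        self._guard = threading.Lock()

    def read(self):
        return self._main

    def compare_and_swap(self, expected, new) -> bool:
        with self._guard:
            if self._main is expected:
                self._main = new
                return True
            return False


def add_subscriber(inode: INode, subscriber: int) -> None:
    while True:
        old = inode.read()
        new = old | {subscriber}  # copy of the node with the change applied
        if inode.compare_and_swap(old, new):
            return
        # Another writer swapped first: retry against the fresh snapshot.


inode = INode(frozenset())
workers = [
    threading.Thread(target=add_subscriber, args=(inode, i)) for i in range(8)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

Readers only ever see a complete snapshot, either the old one or the new one, which is the property the I-node indirection is there to provide.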

For the given set of subscribers, labeled x, y, and z, the CS-trie would look something like the following:

  • x = foo, bar, bar.baz
  • y = foo, bar.qux
  • z = bar.*

Lookups on the CS-trie perform, on average, better than the standard trie, and the CS-trie scales better with respect to concurrent operations.

      subscribe  unsubscribe  lookup
cold  412ns      245ns        1,615ns
hot   471ns      280ns        1,637ns

Latency Comparison

The chart below shows the topic-matching operation latencies for all of the algorithms side-by-side. First, we look at the performance of a cold start (no subscriptions) and then the performance of a hot start (1,000 subscriptions).

Throughput Comparison

So far, we’ve looked at the latency of individual topic-matching operations. Next, we look at overall throughput of each of the algorithms and their memory footprint.

 algorithm msg/sec
naive  4,053.48
inverted bitmap  1,052,315.02
optimized inverted bitmap  130,705.98
trie  248,762.10
cs-trie  340,910.64

On the surface, the inverted bitmap looks like the clear winner, clocking in at over 1 million matches per second. However, we know the inverted bitmap does not scale and, indeed, this becomes clear when we look at memory consumption, underscored by the fact that the below chart uses a log scale.

Scalability with Respect to Concurrency

Lastly, we’ll look at how each of these algorithms scales with respect to concurrency. We do this by performing concurrent operations and varying the level of concurrency and number of operations. We start with a 50-50 split between reads and writes. We vary the number of goroutines from 2 to 16 (the benchmark was run using a 2.6 GHz Intel Core i7 processor with 8 logical cores). Each goroutine performs 1,000 reads or 1,000 writes. For example, the 2-goroutine benchmark performs 1,000 reads and 1,000 writes, the 4-goroutine benchmark performs 2,000 reads and 2,000 writes, etc. We then measure the total amount of time needed to complete the workload.

We can see that the tries hardly even register on the scale above, so we’ll plot them separately.

The tries are clearly much more efficient than the other solutions, but the CS-trie in particular scales well to the increased workload and concurrency.

Since most workloads are heavily biased towards reads over writes, we’ll run a separate benchmark that uses a 90-10 split between reads and writes. This should hopefully provide a more realistic result.

The results look, more or less, like what we would expect, with the reduced writes improving the inverted bitmap performance. The CS-trie still scales quite well in comparison to the global-lock trie.

Conclusion

As we’ve seen, there are several approaches to consider to implement fast topic matching. There are also several aspects to look at: read/write access patterns, time complexity, space complexity, throughput, and latency.

The naive hashmap solution is generally a poor choice due to its prohibitively expensive lookup time. Inverted bitmaps offer a better solution. The standard implementation is reasonable if the search space is finite, small, and known a priori, especially if read latency is critical. The space-optimized version is a better choice for scalability, offering a good balance between read and write performance while keeping a small memory footprint. The trie is an even better choice, providing lower latency than the optimized inverted bitmap and consuming less memory. It’s particularly good if the subscription tree is sparse and topics are not known a priori. Lastly, the concurrent subscription trie is the best option if there is high concurrency and throughput matters. It offers similar performance to the trie but scales better. The only downside is an increase in implementation complexity.

Chris Matyszczyk
Commentary: The very chichi Quince offers a dish that apparently is, well, expensively plated.
disputes international law maps postRob Beschizza

A World of Disputed Territories maps all the countries in the world fighting over territory. Sometimes the disputes are quaint, even comical (visit Rockall or Hans Island!), but others are as tangled as they ever have been. Want to live somewhere undisputed? Try Svalbard!
airline industry business data and computer security flights hacking privacy technology travel world newsAlex Hern in Hamburg

Worldwide system used to coordinate travel bookings between airlines is insecure and easy to exploit, experts reveal

The worldwide system used to coordinate travel bookings between airlines, travel agents, and price comparison websites is hopelessly insecure, according to researchers.

The lack of modern security features, both in the design of the system itself and of the many sites and services that control access to it, makes it easy for an attacker to harvest personal information from bookings, steal flights by altering ticketing details, or earn millions of air miles by attaching new frequent-flyer numbers to pre-booked flights, according to German security firm SR Labs.

Samit Sarkar

Trying to escape that uncanny valley


Chris Matyszczyk
Commentary: An ad for France's Monoprix suggests that shopping on Amazon isn't so very convenient.
INFO LE FIGARO - After a 2015 marked by a rise in antisemitic and anti-Muslim acts, 2016 is shaping up as a year of respite, with declines of 65% and 60% respectively. The figures are still provisional and should be treated with caution, according to representatives of the two faiths.
Race condition in wget 1.17 and earlier, when used in recursive or mirroring mode to download a single file, might allow remote servers to bypass intended access list restrictions by keeping an HTTP connection open.
360 360degree dslr equipment fieldofview interactive lens news sphere sphereoptics spherepro technology virtualreality vrMichael Zhang


Sphere is a new lens that’s designed to turn any DSLR camera into a 360-degree camera. It captures a full 360-degree view horizontally and a 180-degree field of view vertically.

Created by a startup called the Sphere Optics Company, which is also working on similar lenses for GoPro cameras and smartphones, the Sphere Pro lens is currently one of a kind in the world of DSLRs.


“The (sphere) Pro lens employs a novel, toroidal design,” the company writes. “No other lenses with an equivalent capability are known to exist.”

“The Sphere lens allows users who are currently shooting conventional content to quickly and easily shoot immersive content with a low barrier to entry,” co-founder Rob Englert tells Newsshooter. “A shooter can shoot traditional content, then switch to the Sphere lens and quickly capture full spherical content.”

After mounting the Sphere Pro lens on an interchangeable lens camera (whether DSLR or mirrorless), you point the camera straight up to capture a “full, omnidirectional view” that doesn’t require stitching together multiple views.

Here’s a 4-minute demo reel showing what the lens can capture:

Sphere Pro comes with a Nikon F mount, which can then be adapted to other mirrorless or DSLR cameras by other manufacturers. Specs of the lens include an f/8 aperture, optimal focus at 40 inches, a 35mm full frame image circle, a length of 198mm (~7.8in), and a weight of 1.8kg (~4lbs).

Since the development of the lens is separate from the sensor behind it, Sphere footage will improve as camera sensors improve.

If you’re interested in renting or buying a Sphere Pro lens (a limited quantity are currently available), you can get in touch with the company through its website.

entrepreneurship productivity small-business startup time-managementJason Fried

Probably not. Habits die hard.

When I talk about 40 hours being plenty of time to get great work done, I’ll often get pushback from people starting new businesses.

“40 hours may be fine when you’ve been in business for 10 years, but when you’re starting something new you have to bust your ass for as long as it takes. If it takes 80 hour weeks, then it takes 80 hour weeks.”

I’m calling bullshit.

First, this defense often comes from people who haven’t run a previous business for 10 years. So they don’t know what they’re talking about. They’re imagining a future of leisure — that once a business is sailing, it just keeps going. It’ll just get easier, right?

It actually gets harder. Staying in business is harder than starting a business. If it wasn’t, there’d be a whole heck of a lot more companies out there. But most barely last a few years.

The second argument is that there’s simply more work to do when you’re just getting started. Not true. There’s actually not more work to do when you’re just getting started. It’s just different work. The work changes, it doesn’t go away.

Established businesses have to do everything startups have to do, but they also have more customers to keep happy, more staff to manage (which means more personalities to manage), more expenses to cover, more competition to fend off, more legacy to drag along and navigate around, more mass to maneuver.

Ask anyone with a big business if they’d like to be even bigger, or if they long for the days when they were a little smaller. Most will opt for smaller. Fewer demands, more flexibility, easier decision making, less overall organizational complexity.

So people who get used to working 80 hours don’t cut back. Until life cuts them down. Relationships falter, friends go missing, family is a quick kiss as you’re sprinting out the door, and life happens in the margins.

The habits you form early on carry with you. If you think success requires 80 hours when you get started, you’ll hold on to that mentality. You don’t get used to working 40 when you attribute your success to 80. It’s just not how habits work. We continue doing what we get used to.

Instead of whatever it takes, it’s time to start thinking about what it doesn’t take. There’s so much manufactured busyness in those 80 hours that the real gains come from cutting things out, not adding more in.

Ask people in their 30s or 40s who are still putting in long hours why they haven’t been able to cut back. See what they say. Ask them why a few times and get to the root of it.

Most of the habits we form were formed when we were children, when we didn’t have a chance to reflect on what we were doing and set the correct course. As adults starting businesses, we have the capacity to consider the consequences. We should know better. We can do better.

Don’t buy into the myth of a lot now so you can do a little later. It just doesn’t work that way.

80 hours now, 40 hours later? was originally published in Signal v. Noise on Medium, where people are continuing the conversation by highlighting and responding to this story.

Thomas Claburn

Reasonably secure messenger has, for now, outwitted those who would block it

The latest update of Signal, one of the most well-regarded privacy-focused messaging applications for non-technical users, has just been revised to support a censorship circumvention technique that will make it more useful for people denied privacy by surveillance-oriented regimes.…

Jeff Carlson

A 360-degree camera is great if you want to capture the full view of the summit on Half Dome or take in all of the surrounding architecture in the Piazza San Marco in Venice and share that experience on Facebook or YouTube so friends can pan around a scene and fully be there in the moment. After researching 360-degree cameras for 30 hours and testing four top contenders, we think the Ricoh Theta S is the best affordable, user-friendly entry point into this rapidly-developing new category of photography.

somaElaine Gavin

RayKo Photo Center. (Photo: Google Maps)

The RayKo Photo Center, which offers darkroom services, studio rentals, and workshops for San Francisco photographers, is teetering on the edge of closure.

In a heartfelt email sent to patrons this week, owner Stuart Kogod said that he's “stepping out of the shadows to discuss the future of RayKo Photo Center," because the business isn’t making enough money to survive.

“As the proprietor, I will need to make some significant changes, as I can no longer carry it alone," he writes. Unless he can find someone to help him, the photo center will shut down on April 30th, 2017. 

Kogod's hope is that RayKo can survive by transferring ownership or bringing on a new co-owner or group of co-owners. "With the right person at the helm, someone who has business skills and fire in their belly, RayKo could be a successful enterprise," he writes.

The RayKo gallery. | Photo: RayKo Photo Center

RayKo, which opened its doors at 428 Third St. in 2004, bills itself as “the largest public photographic community center west of the Mississippi," with an array of services for photographers and an in-house gallery. The studio offers a wide array of photography classes, from traditional darkroom techniques to old-school processes like tintyping to modern digital photography. 

Unless he can find some help to save the center, Kogod reports that RayKo’s final gallery exhibit will be the tenth annual International Juried Plastic Camera Show, which will run through April 23rd. The store and gallery will also be modifying its hours beginning in February, closing on Monday and Friday and opening from 12:30-9:30pm Tuesday-Thursday and 10am-7pm on Saturday and Sunday.

RayKo's studios. | Photo: RayKo Photo Center

It's been a tough week for longtime SoMa businesses, with bicycle shop Pacific Bicycle announcing it will close in February and nightclub DNA Lounge also reporting that it's on the brink of closure. But Kogod remains positive. 

“Whatever the future holds, I'd like to thank you for joining me in this wild and fulfilling ride," he writes. “I will try my hardest to continue to keep you informed.”

Thanks to Mike W. and Bala for the tip.

See something interesting while you’re out and about? Text Hoodline and we’ll see what we can find: (415) 200-3233.


Solid piece from Quartz on the state of the Holacracy experiment at Zappos. I’ve never worked at Zappos, but it seemed to me like a healthy culture to boldly engage in different productivity and leadership experiments. However, as reported, it does not appear to be working.

Nearly a third of the company has walked out the door. I don’t know what Zappos’ annual attrition rate has been, but I’d guess this is higher. When you combine this with the fact that Zappos dropped off the Forbes “Best Companies to Work For” list, you have evidence of a larger systemic problem.

To me, the money quote in the piece is, “Robertson [Holacracy’s creator] says it takes five years for Holacracy to work.” There’s a short list of the larger companies using Holacracy on their wiki, but my question is twofold: what does working mean, and what company has successfully done it?


news privacyRick Falkvinge

The EU Supreme Court (European Court of Justice) has ruled that no European country may have laws that require any communications provider to perform blanket indiscriminate logging of user activity, stating in harsh terms that such measures violate the very fundamentals of a democratic society. This finally brings the hated Data Retention to an end, even if much too late. It also kills significant parts of the UK Snooper’s Charter.

This morning, Luxembourg time, the European Court of Justice (ECJ) presented its damning verdict. In a challenge brought by plaintiffs in Ireland and Sweden, it was argued that forcing telecommunications providers – ISPs and telecom companies alike – to log all activity of their users, in case law enforcement may need it later, was simply incompatible with the most fundamental privacy rights laid out in the European Charter of Human Rights. The court agreed wholesale.

These blanket everybody-is-a-suspect laws, originally the brainchild of infamous human rights violators in the UK, Ireland, Sweden, and France, were pushed aggressively in the wake of the 2004 Madrid bombings, and were approved by the European Parliament on December 14, 2005, following a scare campaign coupled with the message that they were “necessary to fight terrorism”. The ink on the law was barely dry before they were used to go after ordinary teenagers sharing music and movies instead.

Out of 27 European states, a third never implemented the violations. Some had them revoked by national Supreme or Constitutional courts, like Germany.

The Court states that, with respect to retention, the retained data, taken as a whole, is liable to allow very precise conclusions to be drawn concerning the private lives of the persons whose data has been retained.

The interference by national legislation that provides for the retention of traffic data and location data with that right must therefore be considered to be particularly serious. The fact that the data is retained without the users of electronic communications services being informed of the fact is likely to cause the persons concerned to feel that their private lives are the subject of constant surveillance. Consequently, only the objective of fighting serious crime is capable of justifying such interference. — European Court of Justice (their bolding)

The Court goes on to say that targeted surveillance of people who are currently under individual suspicion of a serious crime was, and remains, legitimate – but not, never, blanket surveillance of everybody and all the time.

This puts an end to 12 years of egregious privacy violations in Europe. The beginning of the end came in April of 2014, when the ECJ declared the Data Retention Directive null and void – they didn’t just rule it cancelled from the date of verdict: they ruled retroactively that it had never existed. That ruling meant that European states were no longer forced to have laws mandating logging of everybody’s activity – but they were not forbidden from having such laws. It went from a federal issue to a states issue.

Until today. From this day, such laws are subject to a blanket ban. Twelve years of bullshit has come to an end.

This also means that the logging-requirement parts of the UK “Snooper’s Charter” are dead as a doornail and subject to the response given in the legal challenge of Arkell v. Pressdram (1971), should the UK government require it.

Privacy remains your own responsibility.

The post Complete Victory: EU Supreme Court Rules Blanket Logging Requirements Blanketly Unconstitutional appeared first on Privacy Online News.

amazon dpdgroup drone french aviation authority gear & gadgets ministry of innovationTom Mendelsohn

(credit: Don McCullough)

The French postal service has been given the go-ahead to start delivering parcels using drones.

France's airspace regulator, the General Directorate for Civil Aviation, cleared the drones for take off. But that doesn't mean French skies will suddenly be abuzz with unmanned aircraft—at present, the drones will only work on a prescribed nine-mile route once a week in the southern region of Provence, as a feasibility test for the tech and regulations.

The trial is being run by DPDgroup, an international subsidiary of French national postal service Le Groupe La Poste. The drone will travel from a pickup point in Saint-Maximin-la-Sainte-Baume to Pourrières in the Var department, a region that has been chosen because it hosts a number of start-up companies, including a dozen specialising in tech.

DPDgroup—which has been working on the project for more than two years with French drone startup Atechsys—said it was "a new way of addressing the issue of last-mile deliveries, especially when it comes to areas that are difficult to access." The firm is particularly keen to use drones to deliver to remote areas, like mountain villages, islands, and rural areas.

A delivery terminal has been developed that assists drones at take-off and landing, and secures them while they're being loaded and unloaded. After 600 hours of flight time, the drone apparently managed an autonomous delivery across a distance of 8.7 miles, carrying a package that weighed 1.5kg, back in September 2015.

This puts the project neck-and-neck with the Amazon drone delivery service that's currently being developed in Cambridge in the UK, and which carried out its first delivery last week.

DPDgroup claims its drone has a range of up to 20km (12.4 miles), can carry a payload of 3kg, and has a top speed of 30km/h (18.6mph). Its navigation system has a range of around 50km (31 miles). It's also equipped with a parachute, in case of emergency.

In the UK, Amazon has an agreement with the Civil Aviation Authority to allow the retail giant to operate multiple drones out of line-of-sight. The American FAA has yet to agree to anything similar, but it has recently allowed another company to operate drones beyond line of sight.

This post originated on Ars Technica UK


checklists general performanceVitaly Friedman

Are you using progressive booting already? What about tree-shaking and code-splitting in React and Angular? Have you set up Brotli or Zopfli compression, OCSP stapling and HPACK compression? Also, how about resource hints, client hints and CSS containment — not to mention IPv6, HTTP/2 and service workers?

PRPL Pattern in the application shell architecture

Performance isn’t just a technical concern: It matters, and when baking it into the workflow, design decisions have to be informed by their performance implications. Performance has to be measured, monitored and refined continually, and the growing complexity of the web poses new challenges that make it hard to keep track of metrics, because metrics will vary significantly depending on the device, browser, protocol, network type and latency (CDNs, ISPs, caches, proxies, firewalls, load balancers and servers all play a role in performance).

The post Front-End Performance Checklist 2017 (PDF, Apple Pages) appeared first on Smashing Magazine.

availability design patterns distributed systems fault tolerance message queues messaging microservices ops resilience engineering soa software engineeringTyler Treat

Complex systems usually operate in failure mode. This is because a complex system typically consists of many discrete pieces, each of which can fail in isolation (or in concert). In a microservice architecture where a given function potentially comprises several independent service calls, high availability hinges on the ability to be partially available. This is a core tenet behind resilience engineering. If a function depends on three services, each with a reliability of 90%, 95%, and 99%, respectively, partial availability could be the difference between 99.995% reliability and 84% reliability (assuming failures are independent). Resilience engineering means designing with failure as the normal.
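The two figures above can be reproduced with a short sketch. It assumes, as the text does, that failures are independent; the function names are illustrative only. The 84% case treats all three services as required, while the 99.995% case assumes the function can still respond in some degraded form as long as at least one service is up.

```python
from math import prod


def all_required(reliabilities):
    """Probability that every dependency is up (no partial availability)."""
    return prod(reliabilities)


def any_sufficient(reliabilities):
    """Probability that at least one dependency is up (partial availability)."""
    return 1 - prod(1 - r for r in reliabilities)


services = [0.90, 0.95, 0.99]
print(f"all required:   {all_required(services):.3%}")
print(f"any sufficient: {any_sufficient(services):.3%}")
```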

Anticipating failure is the first step to resilience zen, but the second is embracing it. Telling the client “no” and failing on purpose is better than failing in unpredictable or unexpected ways. Backpressure is another critical resilience engineering pattern. Fundamentally, it’s about enforcing limits. This comes in the form of queue lengths, bandwidth throttling, traffic shaping, message rate limits, max payload sizes, etc. Prescribing these restrictions makes the limits explicit when they would otherwise be implicit (eventually your server will exhaust its memory, but since the limit is implicit, it’s unclear exactly when or what the consequences might be). Relying on unbounded queues and other implicit limits is like someone saying they know when to stop drinking because they eventually pass out.
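As a minimal sketch of an explicit queue limit, the following uses Python's standard `queue.Queue` with a fixed `maxsize`. The `BoundedInbox` class and its `offer`/`take` methods are hypothetical names chosen for illustration; the point is only that a full queue produces an immediate, predictable "no" rather than unbounded memory growth.

```python
import queue


class BoundedInbox:
    """An inbox with an explicit depth limit (hypothetical helper)."""

    def __init__(self, max_depth):
        self._q = queue.Queue(maxsize=max_depth)

    def offer(self, message):
        """Try to enqueue; return False instead of growing without bound."""
        try:
            self._q.put_nowait(message)
            return True
        except queue.Full:
            return False

    def take(self):
        """Remove and return the oldest message (raises queue.Empty if none)."""
        return self._q.get_nowait()


inbox = BoundedInbox(max_depth=2)
results = [inbox.offer(m) for m in ("a", "b", "c")]
print(results)
```

The caller that receives `False` can retry later, shed load, or propagate the refusal upstream, which is the backpressure signal.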

Rate limiting is important not just to prevent bad actors from DoSing your system, but also yourself. Queue limits and message size limits are especially interesting because they seem to confuse and frustrate developers who haven’t fully internalized the motivation behind them. But really, these are just another form of rate limiting or, more generally, backpressure. Let’s look at max message size as a case study.

Imagine we have a system of distributed actors. An actor can send messages to other actors who, in turn, process the messages and may choose to send messages themselves. Now, as any good software engineer knows, the eighth fallacy of distributed computing is “the network is homogeneous.” This means not all actors are using the same hardware, software, or network configuration. We have servers with 128GB RAM running Ubuntu, laptops with 16GB RAM running macOS, mobile clients with 2GB RAM running Android, IoT edge devices with 512MB RAM, and everything in between, all running a hodgepodge of software and network interfaces.

When we choose not to put an upper bound on message sizes, we are making an implicit assumption (recall the discussion on implicit/explicit limits from earlier). Put another way, you and everyone you interact with (likely unknowingly) enters an unspoken contract of which neither party can opt out. This is because any actor may send a message of arbitrary size. This means any downstream consumers of this message, either directly or indirectly, must also support arbitrarily large messages.

How can we test something that is arbitrary? We can’t. We have two options: either we make the limit explicit or we keep this implicit, arbitrarily binding contract. The former allows us to define our operating boundaries and gives us something to test. The latter requires us to test at some undefined production-level scale. The second option is literally gambling reliability for convenience. The limit is still there, it’s just hidden. When we don’t make it explicit, we make it easy to DoS ourselves in production. Limits become even more important when dealing with cloud infrastructure due to their multitenant nature. They prevent a bad actor (or yourself) from bringing down services or dominating infrastructure and system resources.

In our heterogeneous actor system, we have messages bound for mobile devices and web browsers, which are often single-threaded or memory-constrained consumers. Without an explicit limit on message size, a client could easily doom itself by requesting too much data or simply receiving data outside of its control—this is why the contract is unspoken but binding.
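One way to make the message-size contract explicit is to check the encoded size at the point of sending. The sketch below is illustrative only: `MAX_MESSAGE_BYTES`, `MessageTooLarge` and `encode_message` are hypothetical names, and the 1MB default is an arbitrary choice, not a recommendation from the text.

```python
MAX_MESSAGE_BYTES = 1024 * 1024  # hypothetical limit: 1 MB


class MessageTooLarge(ValueError):
    """Raised when a message exceeds the explicit size limit."""


def encode_message(payload, limit=MAX_MESSAGE_BYTES):
    """Encode a text payload, refusing anything over the agreed limit."""
    data = payload.encode("utf-8")
    if len(data) > limit:
        raise MessageTooLarge(
            f"message is {len(data)} bytes; limit is {limit} bytes"
        )
    return data
```

Because the limit is a known number, the boundary can be tested directly: a payload of exactly `limit` bytes must pass and one byte more must fail.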

Let’s look at this from a different kind of engineering perspective. Consider another type of system: the US National Highway System. The US Department of Transportation uses the Federal Bridge Gross Weight Formula as a means to prevent heavy vehicles from damaging roads and bridges. It’s really the same engineering problem, just a different discipline and a different type of infrastructure.
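For readers unfamiliar with the formula, it is commonly stated as W = 500 × (LN/(N−1) + 12N + 36), where W is the maximum weight in pounds on a group of two or more consecutive axles, L is the distance in feet between the outer axles of the group, and N is the number of axles, with the result rounded to the nearest 500 pounds. That statement comes from general knowledge rather than from this article, so treat the sketch below as illustrative and consult the official tables for real use; the function name is hypothetical.

```python
def bridge_formula_limit(length_ft, axles):
    """Weight limit in pounds for an axle group, rounded to the nearest 500."""
    if axles < 2:
        raise ValueError("the formula applies to groups of two or more axles")
    raw = 500 * (length_ft * axles / (axles - 1) + 12 * axles + 36)
    return round(raw / 500) * 500


# A five-axle group spanning 51 feet.
print(bridge_formula_limit(51, 5))
```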

The August 2007 collapse of the Interstate 35W Mississippi River bridge in Minneapolis brought renewed attention to the issue of truck weights and their relation to bridge stress. In November 2008, the National Transportation Safety Board determined there had been several reasons for the bridge’s collapse, including (but not limited to): faulty gusset plates, inadequate inspections, and the extra weight of heavy construction equipment combined with the weight of rush hour traffic.

The DOT relies on weigh stations to ensure trucks comply with federal weight regulations, fining those that exceed restrictions without an overweight permit.

The federal maximum weight is set at 80,000 pounds. Trucks exceeding the federal weight limit can still operate on the country’s highways with an overweight permit, but such permits are only issued before the scheduled trip and expire at the end of the trip. Overweight permits are only issued for loads that cannot be broken down to smaller shipments that fall below the federal weight limit, and if there is no other alternative to moving the cargo by truck.

Weight limits need to be enforced so civil engineers have a defined operating range for the roads, bridges, and other infrastructure they build. Computers are no different. This is the reason many systems enforce these types of limits. For example, Amazon clearly publishes the limits for its Simple Queue Service: the max queue depth is 120,000 messages for standard queues and 20,000 messages for FIFO queues. Messages are limited to 256KB in size. Amazon Kinesis, Apache Kafka, NATS, and Google App Engine pull queues all limit messages to 1MB in size. These limits allow the system designers to optimize their infrastructure and ameliorate some of the risks of multitenancy, and they make capacity planning much easier.

Unbounded anything, whether it’s queues, message sizes, queries, or traffic, is a resilience engineering anti-pattern. Without explicit limits, things fail in unexpected and unpredictable ways. Remember, the limits exist, they’re just hidden. By making them explicit, we restrict the failure domain, giving us more predictability, longer mean time between failures, and shorter mean time to recovery at the cost of more upfront work or slightly more complexity.

It’s better to be explicit and handle these limits upfront than to punt on the problem and allow systems to fail in unexpected ways. The latter might seem like less work at first but will lead to more problems long term. By requiring developers to deal with these limitations directly, they will think through their APIs and business logic more thoroughly and design better interactions with respect to stability, scalability, and performance.

softwareBrooks Duncan

If you have updated to macOS Sierra 10.12.2 and use Preview to manipulate scanned PDFs, watch out. There seems to be a bug and the OCR text layer can disappear. I’ve replicated this issue on documents scanned with the Fujitsu ScanSnap and the Doxie Q so far.

In the comments to my blog post about ScanSnap on Sierra, awesome DocumentSnap reader Alex writes this:

Since updating to macOS 10.12.2 I have found that Preview destroys the OCR layer of PDFs scanned and OCR’d with the latest ScanSnap Manager software if you make any sort of edit with Preview (e.g. deleting or reordering pages). After editing and saving with Preview, the PDF is no longer searchable and text is not selectable. Managed to replicate the problem on another Mac running 10.12.2. Doesn’t seem to affect PDFs scanned and OCR’d with other scanners or applications. Just wanted to warn everyone to perhaps wait before updating, and check that they haven’t unwittingly destroyed their OCR if they have already updated.

This was confirmed in the comments by reader Jakub.

Since I hadn’t yet upgraded to 10.12.2, I decided to test with scans before and after upgrading, and since I had a Doxie Q sitting on my desk, I tested with that as well to see if it was a ScanSnap thing. I also tested Preview on a machine with 10.12.1 and a machine with El Capitan.

All ScanSnap scans were done with a ScanSnap iX500 using ScanSnap Manager 6.3 L60. All Doxie scans were done with a Doxie Q exported using Doxie software 2.9.1 (1864).

For the test, I scanned documents on Sierra 10.12.1 and 10.12.2, checked that the PDF was OCRed properly, then deleted a page, saved, and re-opened and checked the text again. Here are the results:

  • Scanned with ScanSnap on 10.12.1 & edited on Sierra 10.12.1: OK
  • Scanned with ScanSnap on 10.12.1 & edited on Sierra 10.12.2: GONE
  • Scanned with ScanSnap on 10.12.1 & edited on El Capitan: OK
  • Scanned with ScanSnap on 10.12.2 & edited on Sierra 10.12.1: OK
  • Scanned with ScanSnap on 10.12.2 & edited on Sierra 10.12.2: GONE
  • Scanned with ScanSnap on 10.12.2 & edited on El Capitan: OK
  • Scanned with Doxie Q on 10.12.1 & edited on Sierra 10.12.1: OK
  • Scanned with Doxie Q on 10.12.2 & edited on Sierra 10.12.2: GONE

As you can see, it seems to be something to do with Preview on macOS Sierra 10.12.2. Alex said that he didn’t see the issue with other scanners, but I ran into it with both ScanSnap and Doxie. Both of those scanners use ABBYY for OCR, so that may be relevant.

If you’ve upgraded to 10.12.2 (or see this issue on another platform!), please let us know in the comments if you see the same thing. I’ll update if a fix/workaround appears.

The post macOS Sierra 12.12.2 – OCR Text Removed with Preview And Scanned PDFs? appeared first on DocumentSnap | Going Paperless and The Paperless Office.

e-journals link rot memento web archivingDavid. (noreply@blogger.com)
At the Fall CNI Martin Klein presented a new paper from LANL and the University of Edinburgh, Scholarly Context Adrift: Three out of Four URI References Lead to Changed Content. Shawn Jones, Klein and the co-authors followed on from the earlier work on web-at-large citations from academic papers in Scholarly Context Not Found: One in Five Articles Suffers from Reference Rot, which found:
one out of five STM articles suffering from reference rot, meaning it is impossible to revisit the web context that surrounds them some time after their publication. When only considering STM articles that contain references to web resources, this fraction increases to seven out of ten.
Reference rot comes in two forms:
  • Link rot: The resource identified by a URI vanishes from the web. As a result, a URI reference to the resource ceases to provide access to referenced content.
  • Content drift: The resource identified by a URI changes over time. The resource’s content evolves and can change to such an extent that it ceases to be representative of the content that was originally referenced.
The British Library's Andy Jackson analyzed the UK Web Archive and found:
I expected the rot rate to be high, but I was shocked by how quickly link rot and content drift come to dominate the scene. 50% of the content is lost after just one year, with more being lost each subsequent year. However, it’s worth noting that the loss rate is not maintained at 50%/year. If it was, the loss rate after two years would be 75% rather than 60%. This indicates there are some islands of stability, and that any broad ‘average lifetime’ for web resources is likely to be a little misleading.
Clearly, the problem is very serious. Below the fold, details on just how serious and discussion of a proposed mitigation.

This work is enabled by Web archives' support for RFC 7089, which allows access to preserved versions (Mementos) of web pages by [url,datetime]. The basic question to ask is "does the web-at-large URI still resolve to the content it did when it was published?".
The earlier paper:
estimated the existence of representative Mementos for those URI references using an intuitive technique: if a Memento for a referenced URI existed with an archival datetime in a temporal window of 14 days prior and after the publication date of the referencing paper, the Memento was regarded representative.
The new paper takes a more careful approach:
For each URI reference, we poll multiple web archives in search of two Mementos: a Memento Pre that has a snapshot date closest and prior to the publication date of the referencing article, and a Memento Post that has a snapshot date closest and past the publication date. We then assess the similarity between these Pre and Post Mementos using a variety of similarity measures.
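The Pre/Post selection described in that passage can be sketched as follows. This is not the authors' code: the function name is hypothetical, the snapshot datetimes are assumed to have already been gathered from the archives, and a snapshot taken exactly at the publication instant is (by assumption) counted as neither Pre nor Post.

```python
from datetime import datetime


def pick_pre_post(snapshots, published):
    """Return (Memento Pre, Memento Post) datetimes, or None where missing."""
    before = [s for s in snapshots if s < published]
    after = [s for s in snapshots if s > published]
    pre = max(before) if before else None    # closest snapshot before publication
    post = min(after) if after else None     # closest snapshot after publication
    return pre, post


snapshots = [
    datetime(2011, 3, 1),
    datetime(2012, 5, 20),
    datetime(2012, 7, 4),
    datetime(2014, 1, 15),
]
print(pick_pre_post(snapshots, published=datetime(2012, 6, 1)))
```

A URI reference only enters the comparison when both values are present, which is why the count of usable references is smaller than the count of web-at-large URIs.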
Incidence of web-at-large URIs
They worked with three corpora (arXiv, Elsevier and PubMed Central) with a total of about 1.8M articles referencing web-at-large URIs. This graph, whose data I took from Tables 4,5,6 of the earlier paper, shows that the proportion of articles with at least one web-at-large URI was increasing rapidly through 2012. It would be interesting to bring this analysis up-to-date, and to show not merely the proportion through time of articles with at least one web-at-large URI as in this graph, but also histograms through time of the proportion of citations that were to web-at-large URIs.
From those articles 3,983,985 URIs were extracted. 1,059,742 were identified as web-at-large URIs, for 680,136 of which it was possible to identify [Memento Pre, Memento Post] pairs. Eliminating non-text URIs left 648,253. They use four different techniques to estimate similarity. By comparing the results they set an aggregate similarity threshold, then:
We apply our stringent similarity threshold to the collection of 648,253 URI references for which Pre/Post Memento pairs can be compared ... and find 313,591 (48.37%) for which the Pre/Post Memento pairs have the maximum similarity score for all measures; these Mementos are considered representative.
Then they:
use the resulting subset of all URI references for which representative Mementos exist and look up each URI on the live web. Predictably, and as shown by extensive prior link rot research, many URIs no longer exist. But, for those that still do, we use the same measures to assess the similarity between the representative Memento for the URI reference and its counterpart on the live web.
This revealed that over 20% of the URIs had suffered link rot, leaving 246,520. Some of those now had a different content type, or no longer contained text to be compared. 241,091 URIs remained for which:
we select the Memento with an archival date closest to the publication date of the paper in which the URI reference occurs and compare it to that URI’s live web counterpart using each of the normalized similarity measures.
The aggregated result is:
a total of 57,026 (23.65%) URI references that have not been subject to content drift. In other words, the content on the live web has drifted away from the content that was originally referenced for three out of four references (184,065 out of 241,091, which equals 76.35%).
Another way of looking at this result is that only 57,026 of the 313,591 URIs for which matching Pre/Post Memento pairs existed, or 18.18%, could be shown not to have rotted. For 334,662 out of 648,253 references with Pre/Post Memento pairs, or 51.63%, the referenced URI changed significantly between the Pre and Post Mementos, showing that it was probably unstable even as the authors were citing it. The problem gets worse through time:
even for articles published in 2012 only about 25% of referenced resources remain unchanged by August of 2015. This percentage steadily decreases with earlier publication years, although the decline is markedly slower for arXiv for recent publication years. It reaches about 10% for 2003 through 2005, for arXiv, and even below that for both Elsevier and PMC.
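The percentages quoted in this section follow directly from the counts given in the text, as a quick check shows. The variable names are illustrative; the counts are the ones quoted above.

```python
def pct(part, whole):
    """Percentage of whole, rounded to two decimal places."""
    return round(100 * part / whole, 2)


with_pairs = 648_253       # references with comparable Pre/Post Memento pairs
representative = 313_591   # pairs with maximum similarity on all measures
compared_live = 241_091    # representative Mementos comparable to the live web
not_drifted = 57_026       # live content still matches the Memento

print(pct(representative, with_pairs))                # representative Mementos
print(pct(not_drifted, compared_live))                # no content drift
print(pct(compared_live - not_drifted, compared_live))  # drifted
print(pct(not_drifted, representative))               # shown not to have rotted
print(pct(with_pairs - representative, with_pairs))   # changed between Pre and Post
```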
Similarity over time at arXiv
Thus, as this arXiv graph shows, they find that, after a few years, it is very unlikely that a reader clicking on a web-at-large link in an article will see what the author intended. They suggest that this problem can be addressed by:
  • Archiving Mementos of cited web-at-large URIs during publication, for example using Web archive nomination services such as http://www.archive.org/web.
  • The use of "robust links":
    a link can be made more robust by including:
    • The URI of the original resource for which the snapshot was taken;
    • The URI of the snapshot;
    • The datetime of linking, of taking the snapshot.
The robust link proposal describes the model of link decoration that Klein discussed in his talk:
information is conveyed as follows:
  • href for the URI of the original resource for which the snapshot was taken;
  • data-versionurl for the URI of the snapshot;
  • data-versiondate for the datetime of linking, of taking the snapshot.
But this has a significant problem. The eventual reader will click on the link and be taken to the original URI, which as the paper shows, even if it resolves is very unlikely to be what the author intended. The robust links site also includes JavaScript to implement pop-up menus giving users a choice of Mementos, which they assume a publisher implementing robust links would add to their pages. An example of this is Reminiscing About 15 Years of Interoperability Efforts. Note the paper-clip and down-arrow appended to the normal underlined blue link rendering. Clicking on this provides a choice of Mementos.
The eventual reader, who has not internalized the message of this research, will click on the link. If it returns 404, they might click on the down-arrow and choose an alternate Memento. But far more often they will click on the link and get to a page that they have no way of knowing has drifted. They will assume it hasn't, so will not click on the down-arrow and not get to the page the author intended. The JavaScript has no way to know that the page has drifted, so cannot warn the user that it has.
The robust link proposal also describes a different model of link decoration:
information is conveyed as follows:
  • href for the URI that provides the specific state, i.e. the snapshot or resource version;
  • data-originalurl for the URI of the original resource;
  • data-versiondate for the datetime of the snapshot, of the resource version.
If this model were to be used, the eventual reader would end up at the preserved Memento, which for almost all archives would be framed with information from the archive. This would happen whether or not the original URI had rotted or the content had drifted. The reader would both access, and would know they were accessing, what the author intended. JavaScript would be needed only for the case where the linked-to Memento was unavailable, and other Web archives would need to be queried for the best available Memento.
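To make the difference between the two decoration models concrete, here is a small sketch that emits the anchor markup for either one. Only the attribute names (href, data-versionurl, data-originalurl, data-versiondate) come from the proposal as quoted above; the function name, the example URLs and the date format are assumptions for illustration.

```python
from html import escape


def robust_link(text, original_url, snapshot_url, version_date, link_to_snapshot):
    """Build a decorated <a> element for one of the two robust-link models."""
    if link_to_snapshot:
        # Second model: href is the snapshot, the original URI is the extra attribute.
        href, extra_name, extra_url = snapshot_url, "data-originalurl", original_url
    else:
        # First model: href is the original URI, the snapshot is the extra attribute.
        href, extra_name, extra_url = original_url, "data-versionurl", snapshot_url
    return (
        f'<a href="{escape(href, quote=True)}" '
        f'{extra_name}="{escape(extra_url, quote=True)}" '
        f'data-versiondate="{escape(version_date, quote=True)}">'
        f"{escape(text)}</a>"
    )


original = "http://example.org/page"                          # hypothetical
snapshot = "https://archive.example/2015/http://example.org/page"  # hypothetical
print(robust_link("a cited page", original, snapshot, "2015-08-01", False))
print(robust_link("a cited page", original, snapshot, "2015-08-01", True))
```

With `link_to_snapshot=True` a plain click lands on the archived copy, so no JavaScript is needed for the common case.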
The robust links specification treats these two models as alternatives, but in practice only the second provides an effective user experience without significant JavaScript support beyond what has been demonstrated. Conservatively, these two papers suggest that between a quarter and a third of all articles will contain at least one web-at-large citation and that after a few years it is very unlikely to resolve to the content the article was citing. Given the very high probability that the URI has suffered content drift, it is better to steer the user to the contemporaneous version if one exists.
With some HTML editing, I made the links to the papers above point to their DOIs at dx.doi.org, so they should persist, although DOIs have their own problems. I could not archive the URLs to which the DOIs currently resolve, apparently because PLOS blocks the Internet Archive's crawler. With more editing I decorated the link to Andy Jackson's talk in the way Martin suggested - the BL's blog should be fairly stable, but who knows? I saved the two external graphs to Blogger and linked to them there, as is my habit. Andy's graph was captured by the Internet Archive, so I decorated the link to it with that copy. I nominated the arXiv graph and my graph to the Internet Archive and decorated the links with their copy.
The difficulty of actually implementing these links, and the increased understanding of how unlikely it is that the linked-to content will be unchanged, reinforce the arguments in my post from last year entitled The Evanescent Web:
All the proposals depend on actions being taken either before or during initial publication by either the author or the publisher. There is evidence in the paper itself ... that neither authors nor publishers can get DOIs right. Attempts to get authors to deposit their papers in institutional repositories notoriously fail. The LOCKSS team has met continual frustration in getting publishers to make small changes to their publishing platforms that would make preservation easier, or in some cases even possible. Viable solutions to the problem cannot depend on humans to act correctly. Neither authors nor publishers have anything to gain from preservation of their work.
It is worth noting that discussions with publishers about the related set of changes discussed in Improving e-Journal Ingest (among other things) are on-going. Nevertheless, this proposal is more problematic for them. Journal publishers are firmly opposed to pointing to alternate sources for their content, such as archives, so they would never agree to supply that information in their links to journal articles. Note that very few DOIs resolve to multiple targets. They would therefore probably be reluctant to link to alternate sources for web-at-large content from other for-profit or advertising-supported publishers, even if it were open access. The idea that journal publishers would spend the effort needed to identify whether a web-at-large link in an article pointed to for-profit content seems implausible.
Thomas Ricker

Today LG launched another salvo in the war against giant ugly televisions with the announcement of the ProBeam laser projector. It's rare to find a projector this bright that's small enough to hold in one hand — and this one runs webOS. Unlike most projectors, LG's ProBeam HF80J is long-and-tall instead of short-and-wide, and looks like it stepped off an art deco boardwalk. Better yet, it's rated at 2,000 lumens which is bright enough to be used during the day in most living room setups. But at just 2.1kg (4.6 pounds), it's small enough to take anywhere.

Best of all, webOS makes this projector smart. While we normally turn our noses up at smart televisions, smart projectors let you take this giant display...


homestead scissors stationeryjason
Scissors simplified thanks to Alessandro Stabile. A perfect addition to any studio, the Lama Scissors may be the most minimal form a pair of scissors will ever take. Reminiscent of an old schoolhouse railing, the Lama is two identical stainless steel halves twisted into shape. The form may be simple, but they’re comfortable to grip [...]
economics & business
Geert Hofstede's "Culture's Consequences" is one of the most influential management books of the 20th century. The book, which has well over 80,000 citations, argues that 50 percent of managers' differences in their reactions to various situations are explained by cultural differences. Now, a researcher at the University of Missouri has determined that culture plays little or no part in leaders' management of their employees; this finding could impact how managers are trained and evaluated globally.
The DumpModeEncode function in tif_dumpmode.c in the bmp2tiff tool in LibTIFF 4.0.6 and earlier, when the "-c none" option is used, allows remote attackers to cause a denial of service (buffer over-read) via a crafted BMP image.
api apps browsers mobileStéphanie Walter

Apple taught us, "There's an app for that." And we believed it. Why wouldn't we? But time has passed since 2009. Our mobile users have gotten more mature and are starting to weigh having space for new photos against installing your big fat e-commerce app. Meanwhile, mobile browsers have also improved. New APIs are being supported, and they will bring native app-like functionality to the mobile browser.


We can now access video and audio and use WebRTC to build live video-chat web apps directly in the browser, no native app or plugin required. We can build progressive web apps that bring users an almost native app experience, with a launch icon, notifications, offline support and more. Using geolocation, battery status, ambient light detection, Bluetooth and the physical web, we can even go beyond responsive web design and build websites that will automagically adapt to users' needs and context.

The post The (Not So) Secret Powers Of The Mobile Browser appeared first on Smashing Magazine.


I had been procrastinating making the family holiday card. It was a combination of having a lot on my plate and dreading the formulation of our annual note recapping the year; there were some great moments, but I’m glad I don’t have to do 2016 again. It was just before midnight and either I’d make the card that night or leave an empty space on our friends’ refrigerators. Adobe Illustrator had other ideas:

Unable to set maximum number of files to be opened.

I’m not the first person to hit this. The problem seems to have existed since CS6 was released in 2012. None of the solutions was working for me, and (inspired by Sara Mauskopf’s excellent post) I was rapidly running out of the time bounds for the project. Enough; I’d just DTrace it.

A colleague scoffed the other day, “I mean, how often do you actually use DTrace?” In his mind DTrace was for big systems, critical systems, when dollars and lives were at stake. My reply: I use DTrace every day. I can’t imagine developing software without DTrace, and I use it when my laptop (not infrequently) does something inexplicable (I’m forever grateful to the Apple team that ported it to Mac OS X).

First I wanted to make sure I had the name of the Illustrator process right:

$ sudo dtrace -n 'syscall:::entry{ @[execname] = count(); }'
dtrace: description 'syscall:::entry' matched 500 probes
pboard 1
watchdogd 2
awdd 3
com.apple.WebKit 7065
Google Chrome He 7128
Google Chrome 8099
Adobe Illustrato 36674

Glad I checked: “Adobe Illustrato”. Now we can be pretty sure that Illustrator is failing on setrlimit(2) and blowing up as a result. Let’s confirm that it is in fact returning -1:

$ sudo dtrace -n 'syscall::setrlimit:return/execname == "Adobe Illustrato"/{ printf("%d %d", arg1, errno); }'
dtrace: description 'syscall::setrlimit:return' matched 1 probe
CPU     ID                    FUNCTION:NAME
  0    532                 setrlimit:return -1 1

There it is. And setrlimit(2) is failing with errno 1 which is EPERM (value too high for non-root user). I already tuned up the files limit pretty high. Let’s confirm that it is in fact setting the files limit and check the value to which it’s being set. To write this script I looked at the documentation for setrlimit(2) (hooray for man pages!) to determine the position of the resource parameter (arg0) and the type of the value parameter (struct rlimit). I needed the DTrace copyin() subroutine to grab the structure from the process’s address space:

$ sudo dtrace -n 'syscall::setrlimit:entry/execname == "Adobe Illustrato"/{ this->r = *(struct rlimit *)copyin(arg1, sizeof (struct rlimit)); printf("%x %x %x", arg0, this->r.rlim_cur, this->r.rlim_max);  }'
dtrace: description 'syscall::setrlimit:entry' matched 1 probe
CPU     ID                    FUNCTION:NAME
  0    531                 setrlimit:entry 1008 2800 7fffffffffffffff

Looking through /usr/include/sys/resource.h we can see that 1008 corresponds to the number of files (RLIMIT_NOFILE | _RLIMIT_POSIX_FLAG). Illustrator is trying to set that value to 0x7fffffffffffffff or 2⁶³-1. Apparently too big; I filed any latent curiosity for another day.
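The three hex values printed by the probe can be decoded with a few lines of Python. The two constants are copied by hand from the macOS header mentioned above (RLIMIT_NOFILE is 8 and _RLIMIT_POSIX_FLAG is 0x1000 there); they are hard-coded assumptions in this sketch, not values read from the system, and the function name is illustrative.

```python
RLIMIT_NOFILE = 8            # assumed value from macOS <sys/resource.h>
_RLIMIT_POSIX_FLAG = 0x1000  # assumed value from macOS <sys/resource.h>


def decode_resource(arg0):
    """Split setrlimit's first argument into (resource number, POSIX flag set?)."""
    return arg0 & ~_RLIMIT_POSIX_FLAG, bool(arg0 & _RLIMIT_POSIX_FLAG)


resource, posix = decode_resource(0x1008)
print(resource == RLIMIT_NOFILE, posix)    # which resource, and is the flag set
print(0x2800)                              # requested soft limit (rlim_cur), decimal
print(0x7fffffffffffffff == 2**63 - 1)     # requested hard limit (rlim_max)
```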

The quickest solution was to use DTrace again to whack a smaller number into that struct rlimit. Easy:

$ sudo dtrace -w -n 'syscall::setrlimit:entry/execname == "Adobe Illustrato"/{ this->i = (rlim_t *)alloca(sizeof (rlim_t)); *this->i = 10000; copyout(this->i, arg1 + sizeof (rlim_t), sizeof (rlim_t)); }'
dtrace: description 'syscall::setrlimit:entry' matched 1 probe
dtrace: could not enable tracing: Permission denied

Oh right. Thank you SIP. This is a new laptop (at least a new motherboard due to some bizarre issue) which probably contributed to Illustrator not working when once it did. Because it’s new I haven’t yet disabled the part of SIP that prevents you from using DTrace on the kernel or in destructive mode (e.g. copyout()). It’s easy enough to disable, but I’m reboot-phobic (I hate having to restart my terminals), so I went to plan B: lldb.

First I used DTrace to find the code that was calling setrlimit(2), using some knowledge of the x86 ISA/ABI:

$ sudo dtrace -n 'syscall::setrlimit:return/execname == "Adobe Illustrato" && arg1 == -1/{ printf("%x", *(uintptr_t *)copyin(uregs[R_RSP], sizeof (uintptr_t)) - 5) }'
dtrace: description 'syscall::setrlimit:return' matched 1 probe
CPU     ID                    FUNCTION:NAME
  0    532                 setrlimit:return 1006e5b72
  0    532                 setrlimit:return 1006e5b72

I ran it a few times to confirm the address of the call instruction and to make sure the location wasn’t being randomized. If I wasn’t in a rush I might have patched the binary, but Apple’s Mach-O Object format always confuses me. Instead I used lldb to replace the call with a store of 0 to %eax (to evince a successful return value) and some nops as padding (hex values I remember due to personal deficiencies):

(lldb) break set -n _init
Breakpoint 1: 47 locations.
(lldb) run
(lldb) di -s 0x1006e5b72 -c 1
0x1006e5b72: callq  0x1011628e0     ; symbol stub for: setrlimit
(lldb) memory write 0x1006e5b72 0x31 0xc0 0x90 0x90 0x90
(lldb) di -s 0x1006e5b72 -c 4
0x1006e5b72: xorl   %eax, %eax
0x1006e5b74: nop
0x1006e5b75: nop
0x1006e5b76: nop
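The patch relies on two facts about x86-64: a near call with a 32-bit relative operand is five bytes long, which is why the DTrace one-liner subtracted 5 from the return address on the stack, and xorl %eax, %eax (0x31 0xc0) followed by three nops (0x90) fills exactly those five bytes. A small sketch of that bookkeeping, with hypothetical helper names:

```python
CALL_REL32_LEN = 5                 # opcode byte plus 32-bit relative offset
XOR_EAX_EAX = bytes([0x31, 0xC0])  # sets %eax to 0, i.e. a "successful" return
NOP = bytes([0x90])


def call_site(return_address):
    """Address of the call instruction, given the return address it pushed."""
    return return_address - CALL_REL32_LEN


def build_patch(length=CALL_REL32_LEN):
    """Bytes that replace the call: zero %eax, then pad with nops."""
    return XOR_EAX_EAX + NOP * (length - len(XOR_EAX_EAX))


print(hex(call_site(0x1006e5b77)))
print(build_patch().hex(" "))
```

Padding to the full length of the original instruction matters: leaving stray operand bytes behind would be decoded as unrelated instructions.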

Next I just ran process detach and got on with making that holiday card…

DTrace Every Day

DTrace was designed for solving hard problems on critical systems, but the need to understand how systems behave exists in development and on consumer systems. Just because you didn’t write a program doesn’t mean you can’t fix it.