We organize our work around stories, which are supposed to represent a user requirement, and tech spikes, which we use to figure out how to do something. How do you get architecture/infrastructure work into a sprint? I mean things like "re-implement X using our nifty new architecture" or "set up a new server for QA"?
You know the joke that goes, "The first thing you need to know about recursion, is recursion"? The first thing you need to know about clear agile answers is that agile has no clear answers. Every answer is, "It depends." The rap is that you're making it up as you go along, which is not really a fair summary; but it's also true in some ways, and that is the source of much of agile's value.
For me, what makes an agile solution agile is rapid increments with honest re-evaluation, plus as much empirical evidence as can be gathered at low cost. It's the equivalent of numerical analysis through successive approximation.
Okay, I'm two paragraphs in and still haven't addressed the specifics. I've seen architectural overhead/engineering costs addressed several ways. AFAIK there's no general consensus. I don't have any personal preference.
1. Don't measure overhead at all. The point of doing stories with some sort of scoring system is that you eventually get an approximate team velocity, which helps you predict what you can commit to and tells you when you're behind. The overhead costs of refactoring, deployment, vacation, research, training, conferences, organizational reports/meetings, nosepicking, etc., will come out in the wash. This is why velocity is calculated on actual points delivered; the metric doesn't really care what impedes the delivery. It makes you very empirical, very quickly. The don't-measure approach works well when there's a fairly random distribution of overhead. It also has an enviably low cost of bookkeeping. The downside is, if things get bursty and you flop an iteration due to unusual overhead, there's not much paper trail for management. How much that hurts usually depends on how much trust and track record you have.
2. Budget X items per iteration for overhead, where X is capped. This is sort of a service-level agreement approach. It acknowledges that devs never get as much time as they wish they could have to do things right. But, like an SLA, it also keeps them from starving. It won't work well if the devs can't contain themselves, i.e., they "estimate" Y hours but invariably do every task to completion, regardless of budget. Also, there's a risk that the time taken to argue about what should be at the top of the dev list will eat up a significant portion of the budget. (Agile's not too hot, IMO, at dealing with lack of consensus, or any other issue that requires actual personnel management skills. It's an engineering practice that has a few social dimensions, but it doesn't really help you figure out what to do when teams are dysfunctional. Its main contribution on that front is that it will yield empirical evidence, fairly rapidly, that they ARE dysfunctional.) (It also doesn't dust your cubicles or clean little bits of food out of your keyboard. In other words, hard problems remain hard, and grunty problems remain grunty.)
3. Attach dev costs to customer-visible tasks. This doesn't always have sensible semantics. Deferred dev costs can get suddenly bursty and have their own crosscutting urgency ("the server has 4GB of disk left; avg response time has already increased by 60X and in 18 hours it will crash dead dead dead"). I'm told it works okay if you don't have deferred maintenance--but I've never worked on a project that didn't.
4. Let dev tasks compete straight up in the planning game (or whatever budgeting scheme you use) against customer-visible tasks. Maximal visibility, at the cost of added complexity to planning. Many non-dev stakeholders will just wish the dev costs would go away, and wonder (loudly) why devs don't just work dev magic and get everything instantly right. It's the most honest, empirical, manageable way to go, but it's vulnerable to a number of not-too-rational, impulsive responses and political infighting that developer-types are not usually good at winning.
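To make options 1 and 2 a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function names (`velocity`, `plan_iteration`), the three-iteration averaging window, and the idea of expressing the overhead cap in points rather than item count are all made up for the example, not something any particular agile tool prescribes.

```python
from statistics import mean


def velocity(delivered_points, window=3):
    """Option 1: velocity is the average of points actually delivered.

    Overhead is never recorded; it simply shows up as fewer points
    delivered. The window size is an assumption for this sketch.
    """
    recent = delivered_points[-window:]
    return mean(recent) if recent else 0


def plan_iteration(capacity, overhead_items, story_items, overhead_cap):
    """Option 2: admit overhead items first, but only up to a fixed cap.

    Items are (name, points) tuples, already in priority order.
    Anything that does not fit is skipped and stays in the backlog.
    """
    plan, used, overhead_used = [], 0, 0
    for name, points in overhead_items:
        fits_cap = overhead_used + points <= overhead_cap
        fits_iteration = used + points <= capacity
        if fits_cap and fits_iteration:
            plan.append(name)
            overhead_used += points
            used += points
    for name, points in story_items:
        if used + points <= capacity:
            plan.append(name)
            used += points
    return plan


# Hypothetical history and backlog, purely for illustration.
history = [20, 30, 25, 35]
capacity = velocity(history)
plan = plan_iteration(
    capacity,
    overhead_items=[("set up QA server", 5), ("re-implement X", 8)],
    story_items=[("login", 13), ("search", 8), ("export", 8)],
    overhead_cap=6,
)
```

With these made-up numbers, the cap admits the QA server but pushes the re-implementation to a later iteration, which is the "keeps them from starving, but only just" behavior of option 2. Setting `overhead_cap=0` and leaving the overhead list empty degenerates to option 1.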
Okay, I said I didn't have a preference, but the most sensible approach I've actually been a part of was a hybrid of 2 and 4. This was actually a Lean or Kanban kind of agile, where we organized work around queues that pulled tasks forward, from planning to work-in-progress to demo to deployed-and-done. Most of the dev effort was devoted to the mainline queue, but there were two auxiliary queues: one for pure dev tasks, and another for customer-support issues that we couldn't itemize at the beginning of the iteration but that, based on experience, we knew were likely to crop up at some point mid-cycle. The overall experience of that scheme was not especially better than any other, but it did deal rather sensibly with overhead. Dev overhead and emergent customer-support tasks had enough visibility in the metrics that management could see how they impacted velocity. And the advocates for any given task--both devs and the poor bastards who had customers waiting for them to return their calls--had some visibility into how much work had already happened per iteration on their queues, plus ready access to the pending queue so that they could quickly prioritize new items relative to old. (We kept tasks on 3x5 cards, in physical queues on a wall full of magnetic whiteboards.) The easy visibility had a rather natural self-governing effect toward the end of any iteration: people have a hard time standing up in the open, asking for the 13th special favor of the month.
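For anyone who wants to model that wall of cards in software, here is a rough sketch of the three-queue layout in Python. Everything in it is an assumption made for illustration: the `Queue` class, the stage names, and the card titles are hypothetical, and the original setup was physical cards on whiteboards, not code.

```python
from collections import deque
from dataclasses import dataclass, field

# Stage names are assumptions, paraphrasing the flow described above.
STAGES = ["planning", "in_progress", "demo", "done"]


@dataclass
class Queue:
    name: str
    stages: dict = field(
        default_factory=lambda: {stage: deque() for stage in STAGES}
    )

    def add(self, card):
        """New cards always enter at the planning stage."""
        self.stages["planning"].append(card)

    def pull(self, to_stage):
        """Pull the oldest card from the preceding stage into to_stage."""
        index = STAGES.index(to_stage)
        if index == 0:
            raise ValueError("nothing precedes the planning stage")
        source = self.stages[STAGES[index - 1]]
        if not source:
            return None  # nothing waiting upstream
        card = source.popleft()
        self.stages[to_stage].append(card)
        return card

    def summary(self):
        """Cards per stage: the visibility everyone gets from the wall."""
        return {stage: len(cards) for stage, cards in self.stages.items()}


# One mainline queue plus the two auxiliary queues.
board = {name: Queue(name) for name in ("mainline", "dev", "support")}
board["mainline"].add("story: export report")
board["dev"].add("set up QA server")
board["dev"].add("re-implement X")
board["support"].add("customer callback")

board["dev"].pull("in_progress")  # oldest dev card starts work
board["dev"].pull("demo")         # the same card moves on to demo
dev_summary = board["dev"].summary()
```

The point of `summary()` is the same as the point of the whiteboard: anyone lobbying for one more item on a queue can see at a glance how much that queue has already consumed this iteration.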