
Specs Are Valuable!

Whew. Been quite a while since I’ve written anything here, but it’s largely because I haven’t had much to write about; the holiday season pretty much removed any desire from the business for new additions to the database or to the front-end code, so there wasn’t much for me to do at all. Nevertheless, I was here 40 hours a week, literally doing nothing all day. It was quite painful, but I managed.

Starting last week, however, we’ve faced a new need for programming within our system; namely, Amazon has decided that books and plush dolls need to be specially packaged when sent to them. One would think this wouldn’t really require any programming; the definition of a book or a plush is rather straightforward, and should be perfectly clear to the warehouse personnel. Not so, however; I’ve had to add columns to some tables to denote whether an item requires special preparation, and add a new form to our inventory scanning program so that items can be flagged accordingly. Still, very straightforward, or so one would think.

Over the last week, these seemingly simple additions (which were at first met with a joyous “Very good! Thank you! That’s exactly what we needed!”) have needed to be revised more than 50 times, and that is no exaggeration. The revisions have ranged from denoting not just whether an item needs special preparation but what sort of preparation it needs, to marking items as “yes”, “no”, or “maybe” instead of just “yes” or “no”, to finally marking every single item as “maybe” needing preparation. It’s been quite a mess.

The point is, it would have been delightful if management had held off on asking me to develop these modifications and instead spent about an hour thinking the whole process over, drawing up a list of exactly what they needed, and then handing that to me to code. I could have done everything in one shot, instead of repeatedly thinking I was finished, only to have them return to make me undo what I’d just done, redo it differently, and then have it undone and redone yet again. It would have saved everyone time, and it would have turned this week-long, multi-revision project into a one-day coding session. Unfortunately, the general trend at this business doesn’t quite follow that logic.

Tidying up…

Whew. Well, after the nonsense that ensued from the API mangling debacle (which I cleared up personally through a talk with the shipping company, confirming my suspicions on the matter; it didn’t seem like such a thing would happen otherwise), and a relaxing Thanksgiving interrupted by an 8-in-the-morning call from the business owner that had me cancel my plans with my family and return home early to troubleshoot something that turned out to be our usual business rules working exactly as intended, I’m back in the office. And maybe slightly bitter. *Ahem*.

As the business is now entering the Christmas season, we’re seeing a huge swell in sales, naturally; as a result, we’ve had some hiccups with order processing, as the suppliers can’t keep pace with dozens of sellers needing huge quantities of products. As such, after some discussion with management, I’ve cut back on our listing of supplier products to reduce the potential for selling inventory we don’t have, and I’ve also implemented a view with some joins between our orders, inventory, and suppliers; with this, I can keep a constant view of how our inventory is doing throughout the day, and report any abnormalities to management to have them remedied as soon as possible.
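(For the curious, the view is conceptually something like the sketch below. Every table and column name is a made-up placeholder rather than our actual schema, but the idea is the same: join open orders against our on-hand inventory and the suppliers’ availability so any shortfall stands out at a glance.)

```sql
-- Rough sketch of the monitoring view; all names here are illustrative placeholders.
CREATE VIEW dbo.vw_InventoryStatus
AS
WITH OpenOrders AS
(
    SELECT oi.ItemNumber, SUM(oi.QuantityOrdered) AS OpenOrderQty
    FROM dbo.Orders     AS o
    JOIN dbo.OrderItems AS oi ON oi.OrderID = o.OrderID
    WHERE o.OrderStatus = 'Open'                -- unshipped orders only
    GROUP BY oi.ItemNumber
)
SELECT
    i.ItemNumber,
    i.QuantityOnHand,
    s.SupplierName,
    si.QuantityAvailable             AS SupplierQuantity,
    ISNULL(oo.OpenOrderQty, 0)       AS OpenOrderQuantity,
    i.QuantityOnHand + si.QuantityAvailable
        - ISNULL(oo.OpenOrderQty, 0) AS ProjectedRemaining
FROM dbo.Inventory AS i
JOIN dbo.SupplierInventory AS si ON si.ItemNumber = i.ItemNumber
JOIN dbo.Suppliers         AS s  ON s.SupplierID  = si.SupplierID
LEFT JOIN OpenOrders       AS oo ON oo.ItemNumber = i.ItemNumber;
```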

I’ve also begun tracking the suppliers’ inventory fluctuations throughout the day with a quick query that compares our previous inventory download from them with our latest one, in order to find out when their inventory experiences the most changes. Currently, that seems to happen at around 9 in the morning our time, but I’m going to continue monitoring. It’s important that we get as accurate a picture of their inventory as possible, so we can put out as many products as possible while still preventing the sales of phantom inventory.
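(Conceptually, that comparison is just a join between the latest download and the prior snapshot; something along these lines, with hypothetical table names standing in for ours.)

```sql
-- Rough sketch: compare the previous supplier inventory snapshot to the latest
-- download to see how much changed. Table and column names are illustrative only.
SELECT
    COUNT(*)                                                  AS ItemsChanged,
    SUM(ABS(cur.QuantityAvailable - prev.QuantityAvailable))  AS TotalUnitsMoved
FROM dbo.SupplierInventory          AS cur
JOIN dbo.SupplierInventory_Previous AS prev
  ON prev.ItemNumber = cur.ItemNumber
WHERE cur.QuantityAvailable <> prev.QuantityAvailable;
```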

There’s probably not going to be much further development on our systems due to the busyness of the holiday season, but I’ll be watching the database closely for problems and studying up on more programming in the downtime, so I’ll be keeping busy, at least.

Ranting Again.

*Sigh*. I figured I wouldn’t have need of another rant post, but apparently today’s changed that. There are a few situations around here that are no less than infuriating, frankly, and they’ve driven me up the wall again.

Today, my supervisor brought to my attention that the API calls I’ve written to get shipping charge information for our deliveries are off by quite some margin. This is, of course, a problem, one that I’d like to fix immediately, since it greatly impacts our shipping decisions. However, as a double-check, I ran one of the offending orders through the API again. The numbers came out vastly different this time; for two of the methods, the price had dropped by $4. This was without changing any of the information on the order; clearly, something within the API was functioning a bit abnormally. As a means of triple-checking, I looked at the information supplied to the API in both cases; indeed, it was identical.

Adding to this is the fact that the business owner had met with one of the shipping companies about inconsistencies we’d been seeing in shipping rates just last week; the representatives were alarmed, and checked the rates themselves, noting that the API was indeed returning rates that it should not be returning.

I decided I’d like to see exactly how inconsistently the API was behaving, and thus I ran the offending order through a few more times. Each time, the price returned was different, sometimes by a few cents, sometimes by tens of cents, sometimes by a dollar. Again, this was without changing any order information.

When I presented this to my supervisor, she discussed it with the business owner, who suggested getting our shipping representative on board to talk to the API programming team. My supervisor disagreed; in the face of the evidence that the API was malfunctioning, she squarely placed the blame on my coding.

I am now sitting here trying to figure out some way to explain myself out of this situation, but I have realized there isn’t a way out. The only answer my supervisor will accept, as I’ve learned from previous incidents, is that my coding is indeed wrong, and I’ll need to fix it. This is despite the fact that I have very soundly observed that the API is just not functioning properly, a fact further reinforced by the other shipping API returning the same results on each call to it using the same order. However, nothing will satisfy her besides me bowing my head and admitting that I coded this process incorrectly.

You’ll have to excuse me if I’m just slightly infuriated by this. She has no programming experience whatsoever, and the facts are rather concrete, yet she believes she knows better in this situation and that I’m indisputably coding poorly. To say this is insulting is putting it lightly.

On the other hand, I’m seriously considering finding employment elsewhere now, so that’s a start. Considering the hourly wages here are horrifyingly low for a programmer to be saddling themselves with, perhaps this whole debacle comes with good timing.

Oddities…

Well, I certainly haven’t done much programming this week :p. The business is adding on another company to our data analysis (and paying them three times my hourly wage, to boot, which I feel is rather ironic, but I digress), and as such, I’m having to take a look at our sales data and become something of an inventory manager for a while. Definitely not my area of expertise, but doing this has given me insight into the business’s practices, which, as I am now aware, are somewhat disturbing.

The company we’ve hired is analyzing our sales information on Amazon, and finding out which items we should be ordering more of. Case in point, we ordered 12 of a certain item and sent them all to Amazon; the day they got there, all 12 sold. From this, we could extrapolate that we could sell 360 in a given month, since 12 should sell each day. Naturally, if we wish to guard ourselves against over-committing, we could halve that number, or reduce it even further; however, instead, we simply send 12 a week. They all sell very quickly, naturally, and we sell 48 a month. Why, then, do we not send more?

The reason, as I’ve discovered, is that the business is too terrified and too short on resources to do just that. The ordering process is pretty much a skin-of-our-teeth sort of thing; we’ll order 30 days’ worth of inventory from a supplier, and we’ll order to the point where, if we do not sell what we’ve ordered within those 30 days, we won’t be able to pay the supplier back. Hence, there is terror in the ordering system, in that if we order too much and it doesn’t sell, we’re doomed; we could potentially lose our contract with the supplier.

The question, then, is why do we commit so much to our orders? A huge amount of our capital goes towards stocking new products; the product research team (if you could call it that) investigates products and adds new ones daily based on what’s selling well on the marketplaces at present. However, these items may or may not have a viable long-term sales future; our best sellers, on the other hand, have years of sales reliability. Why don’t we simply stock more of those, and stop throwing our investments to the wind on items that could sell explosively or potentially not sell at all? It was exactly this new-product chase that left the business $100,000 in debt last Christmas: it snatched up as many new products as it possibly could to accommodate the Christmas sales rush, and the vast majority of them fizzled out with no sales. Meanwhile, our reliable sellers showed a huge spike in sales, then promptly flatlined because we ran out of inventory and didn’t have the money to buy more.

I am certainly not a business major, nor do I purport to have anything resembling sound business knowledge, but this seems to be a plainly obvious problem. If we simply held our ground and heavily stocked the items that we know will sell (for example, our best seller has reliably moved at least 1,000 units per year for the last three years, and could potentially move more, if we didn’t keep running out of inventory on it), and gradually incorporated new products, I would say we would have much better returns on our investments. Instead, we sparsely stock our biggest sellers, and bank everything on the success of new products, out of fear of the risk that heavily stocking on a few items might bring. But that doesn’t solve the problem; instead, we are merely transferring the risk from stocking heavily to stocking things that we don’t know will sell. In fact, not only are we transferring that risk, we may indeed be amplifying it; if we know a given item will sell a thousand units in a year, and could even sell more, why are we instead banking on something that we have no idea about?

This quandary is quite distressing to observe, and the more senior workers here have tried to negotiate with the owner about it; indeed, even the new company we’ve hired on has pointed out the folly in our business practices, but these suggestions have gone unheeded. I feel like I’m watching a ship sink, with holes punched in the hull that could easily be patched with onboard supplies, but those supplies are instead being used to gather ambergris to make a balsamic vinaigrette perfume; it might be the next fashion extravaganza, or it could be looked at with an expression resembling The Scream. Silly analogies aside, this method of running a business, at least to me, seems highly unsound; but it’s not my field of expertise, so I may be entirely wrong in my observation.

Onward!

This week’s been fairly busy as well, it seems!  After re-modifying my sales data generator that I cooked up last week, the owner wanted a new report, as well; one that would break the last 365 days into 52 week-long periods, and add up the sales of all of our products for each week, then find the week that had the most sales.  That…  Was not remotely close to something I knew how to do; the best solution I could devise was one that required a temp table for each week-long period, which would probably win an award or two for the worst-designed query ever :p.

Thankfully, the fine volunteers at SQLServerCentral were willing to help me once again!  In this thread (http://www.sqlservercentral.com/Forums/Topic1378284-391-1.aspx), Ray M gave me a hint to a solution.  The initial posting was very inefficient and returned a ton of extraneous data, which was probably the point…  If I truly understood what was going on, I should’ve been able to chisel it down to a streamlined and efficient result set, and I managed it after some fiddling.  Switching from a table variable to a temp table and using a primary key, as well as dumping out all of the week-long periods that I didn’t need (as the initial solution created a timespan every time a sale happened), cut the query’s run-time immensely; it originally took 12 seconds to complete on our server, but after the adjustments, it finished its job in less than a second.  Immense thanks again to Ray M; without his help, I’d be stuck on this problem for weeks, and it gave me a great chance to do some more performance tuning, which is probably one of my favorite things to do with SQL Server.
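(For anyone wanting the general shape of the final approach, here’s a rough sketch. It’s not the exact query from the thread, and the table and column names are placeholders, but it shows the key idea: build the 52 week-long periods once in a keyed temp table, then join the sales to them.)

```sql
-- Build the 52 week-long buckets up front (52 x 7 = 364 days), instead of creating
-- a timespan for every sale. dbo.Sales(SaleDate, Quantity) is a stand-in table.
CREATE TABLE #Weeks
(
    WeekNumber int  NOT NULL PRIMARY KEY,
    StartDate  date NOT NULL,
    EndDate    date NOT NULL
);

DECLARE @AnchorDate date = DATEADD(DAY, -364, CAST(GETDATE() AS date));

INSERT INTO #Weeks (WeekNumber, StartDate, EndDate)
SELECT n.WeekNumber,
       DATEADD(DAY, (n.WeekNumber - 1) * 7, @AnchorDate),
       DATEADD(DAY,  n.WeekNumber      * 7, @AnchorDate)
FROM (SELECT TOP (52) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS WeekNumber
      FROM sys.all_objects) AS n;

-- Sum the sales that fall into each bucket and grab the best week.
SELECT TOP (1)
       w.WeekNumber, w.StartDate, w.EndDate,
       SUM(s.Quantity) AS UnitsSold
FROM #Weeks    AS w
JOIN dbo.Sales AS s
  ON s.SaleDate >= w.StartDate
 AND s.SaleDate <  w.EndDate
GROUP BY w.WeekNumber, w.StartDate, w.EndDate
ORDER BY UnitsSold DESC;

DROP TABLE #Weeks;
```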

The coming weeks are going to see the business shifting gears and entering into holiday-season workloads, which means that I’ll probably see less work headed my way; the business will need to focus more on handling the masses of incoming orders, shipping tasks, and so forth.  I’ll still be tinkering with performance metrics and trying to devise a new procedure or two to help out, and researching as much as I can as well.

Code, code!

Whew!  This week’s introduced a new coding task to my workload, which I appreciate quite a bit.  The owner wanted a report of our sales for the year, as well as aggregated sales data for the past three years.  I accomplished this rather easily with some aggregation and temp tables, producing three years’ worth of data in a single table with an execution time of about 3 seconds and plenty of good index usage.
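(The base report boils down to a straightforward grouping into a temp table; a minimal sketch, with dbo.Sales and its columns standing in for our real tables, would look something like this.)

```sql
-- Aggregate the last three years of sales by item and year into one temp table.
-- dbo.Sales(ItemNumber, SaleDate, Quantity, SaleAmount) is a hypothetical stand-in.
SELECT
    s.ItemNumber,
    YEAR(s.SaleDate)  AS SaleYear,
    SUM(s.Quantity)   AS UnitsSold,
    SUM(s.SaleAmount) AS Revenue
INTO #YearlySales
FROM dbo.Sales AS s
WHERE s.SaleDate >= DATEADD(YEAR, -3, CAST(GETDATE() AS date))
GROUP BY s.ItemNumber, YEAR(s.SaleDate);

SELECT * FROM #YearlySales ORDER BY ItemNumber, SaleYear;
```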

From here, though, he wanted more features; namely, the ability to enter a number of days to look back for sales to create a timeframe window of sales, as well as the ability to look at specific product lines by entering their manufacturer prefix or a specific item number.  This was a bit more challenging; for this, I’d need some dynamic SQL and a parameterized stored procedure.  I hadn’t written either of these beyond rather simple practice exercises, but this would be a simple case as well; the existing report would stay as-is, but it would have the additional condition that, if parameters were supplied, these would be passed to the WHERE clause and used for the search.

Naturally, however, I couldn’t just do this blindly; tossing user input into the WHERE clause badly is just asking for SQL injection attacks, and even though the platform we’re using is strictly for in-house usage, I can’t use that as an excuse to be lazy.  Thankfully, I’d read up on Gail Shaw’s excellent articles on SQL injection at http://sqlinthewild.co.za/, which gave me ideas on how to prevent such attacks.  After coding up a solution that seemed to be airtight enough, I tried to hit the database with SQL injection through the frontend in various ways to destroy a testing table.  None of the methods I’d dug up worked, and as such, I’m confident that I’ve safeguarded the database from attack from this procedure.

On top of that, since parameters are being used in the procedure, I wanted to make sure that parameter sniffing wouldn’t be an issue.  Gail also has excellent posts about that issue, and after reading through them again, I decided that using OPTION (RECOMPILE) in the procedures would be sufficient.  The procedure wouldn’t be called more than a dozen times a day, if even that much, so the overhead of recompilation wouldn’t be particularly severe.  This way, I could completely bypass the risk of parameter sniffing while maintaining an efficient querying method.
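(To show the pattern rather than the production code, here’s a pared-down sketch of the idea: the optional filters are appended to the WHERE clause as parameters, never as concatenated user input, the string is run through sp_executesql, and OPTION (RECOMPILE) sidesteps the sniffing issue. The procedure, table, and column names are all placeholders.)

```sql
-- Pared-down sketch of the parameterized report; all names are placeholders.
-- User input only ever reaches the query as parameters, never by concatenation.
CREATE PROCEDURE dbo.usp_SalesReport
    @DaysBack   int         = 365,
    @ItemPrefix varchar(20) = NULL,
    @ItemNumber varchar(50) = NULL
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @sql nvarchar(max) = N'
        SELECT s.ItemNumber,
               SUM(s.Quantity)   AS UnitsSold,
               SUM(s.SaleAmount) AS Revenue
        FROM dbo.Sales AS s
        WHERE s.SaleDate >= DATEADD(DAY, -@DaysBack, CAST(GETDATE() AS date))';

    -- Append the optional filters only when they were actually supplied.
    IF @ItemPrefix IS NOT NULL
        SET @sql += N' AND s.ItemNumber LIKE @ItemPrefix + ''%''';
    IF @ItemNumber IS NOT NULL
        SET @sql += N' AND s.ItemNumber = @ItemNumber';

    SET @sql += N'
        GROUP BY s.ItemNumber
        OPTION (RECOMPILE);';    -- recompile per call to dodge parameter sniffing

    EXEC sys.sp_executesql
        @sql,
        N'@DaysBack int, @ItemPrefix varchar(20), @ItemNumber varchar(50)',
        @DaysBack, @ItemPrefix, @ItemNumber;
END;
```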

Finally, the owner had another request; use the results from the procedure to suggest what we should order to fulfill our inventory needs for a given period of time.  That was also quite simple, as I just needed to take the results of the procedure and subtract our current inventory from the sales; if the calculation produced a negative number, then we have more inventory than sales, and the item doesn’t need more inventory for today.  Otherwise, we could use some more inventory, and the result would be the suggested amount to order to provide for the given timeframe’s worth of sales.
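(That calculation is about as simple as it sounds; roughly, with hypothetical names:)

```sql
-- Suggested order quantity = sales for the window minus what's on hand, floored
-- at zero. #ReportResults stands in for the report procedure's output, and
-- dbo.Inventory for our on-hand quantities.
SELECT
    r.ItemNumber,
    r.UnitsSold,
    i.QuantityOnHand,
    CASE WHEN r.UnitsSold - i.QuantityOnHand > 0
         THEN r.UnitsSold - i.QuantityOnHand
         ELSE 0
    END AS SuggestedOrderQuantity
FROM #ReportResults AS r
JOIN dbo.Inventory  AS i ON i.ItemNumber = r.ItemNumber;
```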

Overall, this was a nice week’s worth of work; I got to apply several practices that I’d read about but didn’t have much occasion to seriously implement, and the results have turned out to be quite favorable.  There’s doubtlessly going to be further modifications and additions requested on this project, but I believe I’ve developed a solid base to work on.

EDIT: And there’s the complication that I was expecting but hoping wouldn’t happen! The end-users decided today that the report should use an entirely different set of numbers, ones that have to be calculated and thus aren’t part of what I’ve worked on all week. This, indeed, is the reason why a well-thought-out and precisely written spec would be great to have before starting on any coding project. Oh well, one week’s worth of work is junk now, so time to begin working on the new report.

Refactor Yet Again!

Whew. After some unpleasantness there with having to work a weekend where I had plans (and what do you know, they didn’t even need me at the office! I still had to stick around for 4 hours though…), I’ve been on a diligent refactoring binge. There wasn’t much to do last week, so I sat down and examined the routines we run on a daily basis to find any performance issues.

The first thing I looked for was indexes that were going unused. When I created my own tables in SQL Server, I assigned clustered and nonclustered indexes to each table based on how I figured the optimizer would make use of them. In most cases, I was quite accurate; however, on some tables, the indexes weren’t being used at all, or they were being written to far more often than they were being used. I ran a few tests with those indexes in place, and things usually completed in less than a second, or about 10 seconds in the case of our heavy, once-an-hour routine for inventory quantity generation. After removing the unneeded indexes, we were still completing those tasks in the same amount of time; hence, it was apparent that the indexes weren’t really necessary.
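(The unused-index hunt was essentially a pass over sys.dm_db_index_usage_stats, something along these lines; indexes racking up writes with next to no seeks, scans, or lookups were the candidates for removal.)

```sql
-- Index usage for the current database since the last SQL Server restart;
-- heaps are excluded. Lots of Writes with few reads = suspect index.
SELECT
    OBJECT_NAME(i.object_id)    AS TableName,
    i.name                      AS IndexName,
    ISNULL(us.user_seeks, 0)    AS Seeks,
    ISNULL(us.user_scans, 0)    AS Scans,
    ISNULL(us.user_lookups, 0)  AS Lookups,
    ISNULL(us.user_updates, 0)  AS Writes
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS us
       ON us.object_id   = i.object_id
      AND us.index_id    = i.index_id
      AND us.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.index_id > 0
ORDER BY Writes DESC, Seeks ASC;
```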

Afterwards, I added some new indexes after examining the routines more closely, and I nailed things down much more accurately on this pass. Most of the indexes I created started picking up a good number of seeks, with a few scans here and there, and they required little in the way of writes; some didn’t need any writes at all, which was great! Checking into our processes afterwards, the sub-second ones were still just as good, but our 10-second process dropped to 6 seconds. A 40% reduction in run time, whee! Sure, on a 10-second routine that’s run once an hour, it’s not particularly impressive, but there’s no telling how these procedures will scale in the future; I’d much rather make preparations to absorb any future shocks, as it were, than sit complacently.

Hopefully I’ll have a bit more to work with next week, and maybe I’ll get a bit more time to re-examine our optimization as well; things are looking better in most places, but I’d like to be sure I’ve reinforced our coding everywhere I can manage it.

Bit of a Rant…

Argh. This week has been going fairly well, with more programming-related things being the majority of my workload, but today’s slammed down a rather unpleasant addition to the week. Namely, I’m being expected to work on Sunday this week. I know, I know, #firstworldproblems.

However, this is particularly grating because of the nature of the work I’ll be doing. My supervisor is going to upload a file to one of our selling channels to change our handling time from three-day shipping to one-day shipping. That’s what I’m there for. No, not to upload the file; I’m there in case uploading the file makes our system crash and burn. Yes, let that sink in: in case uploading a file to someone else’s system makes our system break down. Huh?

This is also accompanied by other extreme absurdities. For example, I’ll be working a full 8 hours that day. Before the file is uploaded, we’ll have an hour-long meeting to discuss what effects the upload will have (uh, it’ll change our three-day listings to one-day listings?), and how the file should be uploaded (perhaps we should just use the channel’s upload interface?). Why we need an hour-long discussion on these topics, I have no idea.

I’ve had to work weekends here before, and it’s been largely the same issue. The first time, it was when my supervisor reset our router because of a network issue, even though our network tech told her explicitly to never restart the router. The restart destroyed our network infrastructure, and I have no experience in networking, so she called a networking expert in. Over the course of two days, he managed to get our network up and running again. In the meantime, I was there too. I did nothing useful. Again, I don’t understand networking; I went to school to be a DBA, not a network admin. But in the eyes of the supervisor and owner, “computers are really simple, so if you watch him work on it, you should be able to learn how he does it!”. Nope. Didn’t learn a thing. Also, we actually have our own in-house networking tech, but he does nothing but warehouse work. It’s been that way for him since I started working, because “you’re younger, so you know more about computers!”. Sometimes, I can’t believe that such outmoded and surreal thinking guides our business practices.

Honestly, I wouldn’t mind doing this sort of thing if I could do it remotely, as well, but I’m not allowed to. I truly have no idea why I’m being called in for an eight-hour shift on one of my days off to literally babysit somebody as they do a half-dozen mouse clicks to upload a file. It’s degrading, a waste of my time, and rather infuriating, considering I already had plans for the weekend. This is probably little more than an angsty rant with no real purpose, but seriously, I can’t fathom why I’m going to be dragged back to work for a task any simpleton could perform.

Back to Business!

Whew! These last two weeks have been rather refreshing. Rather than having to think up new business rules and routines, I’ve been left to just do programming. As you might imagine, I’m a good bit better at that than I am at business management.

The first task to be tackled was that one of our distributors was starting to send out a new data feed for its inventory information; the problem there was that the feed was in a different format. Originally, their feed had a comma-delimited string showing their warehouse availability (like Y,Y,Y,Y, to show that it’s available in all four warehouses). However, their new feed doesn’t have column headers, and the warehouse indicators are individual columns. I handled this problem by importing the feed into SQL Server, doing string concatenation with commas on the indicator columns, and depositing the reformatted data into the usual holding table for the feed. However, we also needed to get this information into Microsoft Access, and Access was expecting column headers different from those I use in SQL Server. Further adjustments were done by exporting the holding table into a text file, then using PowerShell to do a string replace on the file contents to get the column headers right. The text file was then linked into Access, and it’s imported into the Access version of the table as a replacement for the old routine we had for this feed.
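(The SQL Server half of that is nothing fancy; a sketch of the reformatting step, with the staging and holding table names made up for illustration, looks roughly like this.)

```sql
-- Rebuild the old comma-delimited warehouse string from the new feed's individual
-- indicator columns, then drop the result into the usual holding table.
-- dbo.SupplierFeed_Staging and dbo.SupplierFeed are hypothetical names.
INSERT INTO dbo.SupplierFeed (ItemNumber, WarehouseAvailability)
SELECT
    stg.ItemNumber,
    stg.Warehouse1 + ',' + stg.Warehouse2 + ',' +
    stg.Warehouse3 + ',' + stg.Warehouse4        AS WarehouseAvailability
FROM dbo.SupplierFeed_Staging AS stg;
```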

One regret I had on handling this problem was that I couldn’t use PowerShell to download the file from the distributor, which would’ve given us an easy all-in-one solution; when I attempted the download through PowerShell, even using an authenticated session, I’d download 6KB of the 2MB file, and the download would terminate. Not sure why it did that, but I’m guessing the distributor doesn’t want command-line connections to its datastore. The file’s not on an FTP server, either, so my usual routines for getting FTP data aren’t applicable. Still, it only adds a minute to the procedure to manually download the file, so it’s not too bad.

Next up was the process of finding the average shipping cost over the past 30 days for each item we’d sold over that time span. The trick here was that orders that had multiple items shouldn’t be used in the calculation, since differentiating the shipping costs in our database isn’t possible; they’re all summed together in the shipment information table. So, using a series of CTEs and aggregates, I was able to prune down the orders table to single-item orders, then join those orders’ items to the shipment info table, then sum up the shipment costs and average them out. It was a little tricky figuring it all out, since the shipment info table doesn’t have the item numbers in it, but doing the calculations separately and joining the results worked very nicely. It’s a great technique I’ve learned from a ton of reading at SQLServerCentral.com :).
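(The general shape of that query, with placeholder table names, is below: one CTE prunes the orders down to single-item ones, another sums each order’s shipping costs, and the final join averages those costs per item.)

```sql
-- Average shipping cost per item over the last 30 days, using single-item orders
-- only. dbo.OrderItems and dbo.Shipments are stand-ins for our real tables.
WITH SingleItemOrders AS
(
    SELECT oi.OrderID, MIN(oi.ItemNumber) AS ItemNumber
    FROM dbo.OrderItems AS oi
    GROUP BY oi.OrderID
    HAVING COUNT(*) = 1            -- exclude multi-item orders
),
ShipmentCosts AS
(
    SELECT sh.OrderID, SUM(sh.ShippingCost) AS ShippingCost
    FROM dbo.Shipments AS sh
    WHERE sh.ShipDate >= DATEADD(DAY, -30, CAST(GETDATE() AS date))
    GROUP BY sh.OrderID
)
SELECT
    sio.ItemNumber,
    AVG(sc.ShippingCost) AS AvgShippingCost
FROM SingleItemOrders AS sio
JOIN ShipmentCosts    AS sc ON sc.OrderID = sio.OrderID
GROUP BY sio.ItemNumber;
```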

The last task for this week, which will need to be done over the weekend, is doing some capacity planning for a new Access database file. Our master product tables are starting to bloat, and they’re causing our data file to hit the dreaded (and tiny) 2GB limit for Access files rather often, so I’m planning to move some of those tables into their own data file and re-point the links in our front-end to the new file. It should be a rather easy and painless procedure, and from what I’ve calculated, this should buy us a huge amount of extra breathing room. It would definitely be preferable if Access could abolish the 2GB limit on its files, but I’m guessing that would require rather drastic changes to the base coding for the program.

In any case, it feels quite good to get back to pure programming. I’d really rather do nothing else if I could, but the business is shorthanded these days, so everyone’s having to do everything. In fact, just last week, one of the accountants was being trained to do shipment preparation work, and one of the inventory crew was being trained to do accounting! I’m really not quite sure that’s the best idea, though.

Data Analysis

It seems that the business will be needing some new considerations on handling its inventory from this point forward. The owner would like to have a method developed where, when considerations for inventory restocks are made each day, the Access layer could suggest what should be ordered and what should be discontinued. I believe there may be a number of problems with developing such a method, however.

The primary concern I have is that automated suggestions on what to order and what not to order could lead to dangerous complacency; if the suggestions happily assume that an item will keep selling wildly despite the fact that it was just released, then the suggestion to order more could be quite hazardous. Furthermore, deciding the cutoff for when an item should stop being considered a major seller, and how to determine that cutoff, would be very tricky; do we consider an item to be a major seller just because it moved 100 units on the day of its release? What about our top item, which sees major activity perhaps once a week, but otherwise doesn’t sell much, if at all? There are also considerations needed for seasonal items, such as swimming-related products, which are active for one quarter of the year and are otherwise completely dormant.

In my opinion, this is the sort of thing that would require a person to sit down and do hands-on analysis of the data we already have. We bought a vendor solution for this very problem earlier in the year, and it does an admirable job of showing our sales per item, item line, or selling channel for any given date range. It will also show the profitability of any given item for its sales, along with many other valuable pieces of data.

One could certainly question why an entirely new process needs to be developed to present data that we can already view in a very succinct format. The main reason is that the owner doesn’t have the time to sit down and thoroughly examine any one product line, or even the generalized view of all of our sales data. As such, he’d simply like little snippets of data and information, and suggestions based on those snippets, in an easily readable form. However, I believe this is a mistake; what is truly needed is a data analyst. Looking at the big picture of our sales and pruning out items that aren’t performing well, while boosting our emphasis on items that are big sellers, should be our focus. Programming could make this easier, and it could automate the suggestion of ideas to facilitate the process further, but it can’t hope to replace an in-depth analysis from someone who knows the (frequently shifting) goals of the business and the demands of the current situation.

It would be delightful if I could indeed code up something that could reflexively adapt and predict the current market operations in our field, but I don’t believe that it’s something that’s possible at the moment. Rather, I feel that it would be best if we could dedicate an employee to doing in-depth examinations of the plentiful sales data that we have at present, but which is going completely unused. Hopefully I can convince management of this, but if not, I’ll have a rather convoluted and potentially error-prone suggestion engine to build.