A/B Testing FAQ

Frequently asked questions about A/B testing in Salesforce B2C Commerce.

Q: How is A/B testing sorting rule inheritance related to category sorting rule inheritance?

A: They are directly related, and you should configure your A/B tests to take advantage of that inheritance. When configuring an A/B test, select only the top-level categories and rely on inheritance to cover their subcategories. Don't select every subcategory in the category tree and explicitly assign a sorting rule to each one.

Q: Is it true that for A/B testing to work properly, you must add varyby="price_promotion" to the <iscache> tag of each template that uses caching?

For example, the SiteGenesis product-related templates already include this attribute, but the slot content templates don't. To use slot content as an A/B test experience, must I add this attribute to the slot content templates?

A: Caching is a complex topic, and there is no general rule for all templates. If a template shows static content, cache it for a reasonably long period, such as 60 minutes. If a template is specific to an individual storefront customer, such as the cart contents, never cache it. If a template is personalized by price, promotion, search results order, or a custom A/B test experience, cache it with the varyby="price_promotion" attribute on its <iscache> tag.
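
For illustration, a template that shows static content could use a plain relative cache, while a personalized template adds the varyby attribute (the 60-minute duration here is just an example, not a requirement):

  <iscomment>Static content: plain relative cache</iscomment>
  <iscache type="relative" minute="60"/>

  <iscomment>Personalized by price, promotion, sorting rules, or A/B test experiences</iscomment>
  <iscache type="relative" minute="60" varyby="price_promotion"/>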

If you have an unpersonalized home page, for example, cache it for 60 minutes. A/B tests still work for the site despite this cached page: storefront customers are placed in A/B test segments as expected, whether they view the home page right away or never see it at all.

If your search uses a sorting rule, make sure your search results page accounts for the possibility of personalized sorting rules by caching it with varyby="price_promotion". Do this even if you have no A/B tests, or no A/B tests with sorting rule experiences: a campaign could still personalize the sorting rule for a search through customer group, source code, or coupon qualifiers.

Using the varyby="price_promotion" attribute to cache pages isn't really about A/B tests; it's about personalization, and A/B tests are one way to personalize. Slot caching works differently: slot content personalization depends only on the customer groups and A/B test segments that apply to a storefront customer, not on promotions or price books.

You can cache slot content for 60 minutes, for example, even if that content appears only to a specific customer group or a specific A/B test segment. Salesforce B2C Commerce renders that content specifically for that group or segment, and everyone in the group or segment sees the same content. The exception is content personalized per customer: if slot content shown to a specific customer group or A/B test segment included the storefront customer's name, it couldn't be cached for 60 minutes, because everyone would see the same name. There are no such personalized slot content templates in SiteGenesis, which is why varyby="price_promotion" doesn't appear there.

Q: Why are there discrepancies between A/B test metrics observed in B2C Commerce and those reported by external analytics tools?

A: Each analytics tool has its own, usually proprietary, way of tracking the storefront. The logic of external analytics tools isn't disclosed to us, so we can't help compare metrics directly. However, some common causes can explain discrepancies between analytics tools.

If you have multiple A/B tests running concurrently in B2C Commerce, a shopper is assigned to the first test that they qualify for (usually when they land on the homepage or a category page). Once assigned, they're omitted from any other A/B tests. This omission from further B2C Commerce A/B tests can cause site optimizers to see more visits to their other test pages in external tools, even though only a subset of those shoppers actually receives the test and is tracked in B2C Commerce.
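
For example, if a shopper is assigned to one test when they land on the homepage and later views a page that belongs to a second test, an external tool counts that view toward the second test, but B2C Commerce doesn't count the shopper as a participant in it.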

If the participation trigger is set to Per Session or Per Customer, the shopper is reconsidered for an A/B test each time they return to the site.

If you're using plug-ins, you can experience issues with your analytics data. For example, you could see an artificial inflation of Facebook-referring traffic and an increase in pageviews. Some plug-ins trigger a redirect when a user interacts with them, or even when they merely appear on the page. The redirect goes to a third-party site and then returns the user to the storefront site. Because the redirect doesn't pass the referrer, the traffic sources report shows the visit as originating from that third-party site, overwriting the correct campaign or traffic source.

These plug-ins can be affected:

  • Activity Feed
  • Recommendations
  • Like Button

For the SiteGenesis or mobile web storefront applications, the Like button is supposed to be an inactive display element until a user clicks it. However, it actually makes a call to Facebook that issues a callback to the referenced product detail page. As a result, when a customer views a product, the Like button causes a second visit to the product detail page, which is counted as another A/B test visitor. Because this factor is consistent across all segments, it doesn't change the ratio of the test results.

Finally, bot traffic can also cause skewed results in A/B testing compared to other analytics tools.

Q: Are any A/B testing parameters enforced by quotas?

A: Quotas enforce one A/B testing limitation: a maximum of five test segments within a test, including the control group segment; that is, up to four user-defined segments.

Note: Best practice is to edit and configure A/B tests on staging. After testing, reviewing, and approving your A/B tests, you can replicate the tests to production.

Q: You can create, import, and enable an unlimited number of A/B tests, yet only a limited number of them can be truly active. How does B2C Commerce select the active tests?

A: The Business Manager selects the active tests by:

  • Comparing their end dates and selecting the tests that end first
  • If they have the same end dates, comparing their start dates and selecting the ones that started the most recently
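
For example, if one test ends on June 30 and another ends on July 15, the June 30 test is selected first; if both end on June 30, the test that started more recently is selected.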

This determination is dynamic: if a test is enabled or newly created, it can immediately take precedence over one of the others. B2C Commerce allows a maximum of 12 A/B test segments across all active A/B tests, and a single A/B test can have up to four segments plus the control group.

Q: Is Active Data tagging a prerequisite for using A/B testing?

A: No. Active data isn't a requirement, but it's recommended. For example, active data can be used for sorting rules.

Q: A/B testing setup is done on staging and then replicated to production. Must I also reindex?

A: No. Reindexing is not necessary.

Q: Is it true that the more tests you run, the more diluted your participant pool gets?

A: Running tests one after the other has no effect on the participant pool. Running tests simultaneously results in the tests sharing the possible participants, because each customer can participate in only one test at a time. The more tests run on the site, the fewer participants in each test. Fewer participants mean a smaller sample size, which often requires tests to run longer. However, there can be only three simultaneous A/B tests in B2C Commerce.

Q: What experiences does B2C Commerce support?

A: B2C Commerce supports experiences with promotions, content slots, and sorting rules. In addition, we support script customization where you can define other, custom experiences.

Q: Do you recommend running one test for a campaign with multiple experiences per segment, instead of running two separate tests? For example, one test with both a content slot and a promotion, or two tests, one with the content slot and the other with the promotion?

A: The answer depends on what you want to test. If you want to measure the effect of each experience independently, run separate tests. If you want to test the combined effect of a strategy that includes both experiences, run a single test. With a single test, you can't tell whether the results are due to one experience, the other, or the combination of both. Run individual tests to better understand this level of detail.

Q: Do tests that run in parallel influence each other?

A: Yes and no. Yes, if the experiences overlap, because users assigned to one test don't participate in the others, which can skew the results. No, if the experiences don't overlap. However, if both tests measure the same metric, such as conversion rate, it's difficult to argue which experience caused an increase. It also takes longer for the tests to become conclusive, because the traffic is distributed across more test segments. In general, we recommend running one test at a time to eliminate any doubt about other influences.

Q: If I have an A/B test for slots on the homepage, how do I determine the percentage to allocate for segments versus the control group?

A: The choice of one percentage over another depends on the reason for the test and its configuration. For example, a merchant who wants to test content slots on the homepage, comparing conservative content with something risky or provocative, might choose a low allocation for the group seeing the risky content. Alternatively, to test two versions of the same slot with minor differences, such as color or font, a merchant could select equal allocations (50/50 or 33/33/33). A lower allocation results in a smaller sample size of participants in that segment, which on average requires you to run the test for a longer period. You could run the risky content with a 50/50 allocation for a shorter test; however, with a smaller allocation, you can pull the test faster if it generates a negative reaction, with fewer users having seen the risky content.
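
As a rough illustration (the numbers are assumptions, not guidance): with 10,000 eligible visitors per day, a segment allocated 10% of traffic gathers about 1,000 participants per day, while a 50% segment gathers about 5,000, so the smaller allocation needs roughly five times as long to reach the same sample size.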

Q: What events cause an email notification to be sent, and how frequently are they sent?

A: B2C Commerce sends email notifications when:

  • The A/B test ends.
  • The results of the test become statistically significant.
  • The A/B test is paused.
  • The A/B test is resumed.

Email notifications are sent hourly.